I. Basis of the report

1. This report has been drawn on the basis of (substitute sheets which have been furnished to the receiving Office in
response to an invitation under Article 14 are referred to in this report as "originally filed" and are not annexed to 
the report since they do not contain amendments.): 

Description, pages: 

1-42 as originally filed

Claims, No.: 

1-54 as originally filed

Drawings, sheets: 

1/8-8/8 as originally filed

2. The amendments have resulted in the cancellation of:

□ the description, pages: 

□ the claims, Nos.: 

□ the drawings, sheets: 

3. □ This report has been established as if (some of) the amendments had not been made, since they have been 

considered to go beyond the disclosure as filed (Rule 70.2(c)): 

4. Additional observations, if necessary: 
II. Priority 

1. □ This report has been established as if no priority had been claimed due to the failure to furnish within the

prescribed time limit the requested: 

□ copy of the earlier application whose priority has been claimed. 

□ translation of the earlier application whose priority has been claimed.

2. □ This report has been established as if no priority had been claimed due to the fact that the priority claim has 

been found invalid. 
Thus, for the purposes of this report, the international filing date indicated above is considered to be the relevant date.
3. Additional observations, if necessary: 

see separate sheet 

III. Non-establishment of opinion with regard to novelty, inventive step and industrial applicability

The questions whether the claimed invention appears to be novel, to involve an inventive step (to be non-obvious), 
or to be industrially applicable have not been examined in respect of: 

□ the entire international application. 
☒ claims Nos. 32-40, 53, 54.

because: 

☒ the said international application, or the said claims Nos. 32-40, 53, 54 relate to the following subject matter
which does not require an international preliminary examination (specify):

see separate sheet 

□ the description, claims or drawings (indicate particular elements below) or said claims Nos. are so unclear
that no meaningful opinion could be formed (specify):

□ the claims, or said claims Nos. are so inadequately supported by the description that no meaningful opinion 
could be formed. 

□ no international search report has been established for the said claims Nos.
V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial
applicability; citations and explanations supporting such statement 

1. Statement

Novelty (N)                     Yes:  Claims  3-10, 12-31, 41-52
                                No:   Claims  1, 2, 11
Inventive step (IS)             Yes:  Claims  3-10, 12-31, 41-52
                                No:   Claims  1, 2, 11
Industrial applicability (IA)   Yes:  Claims  1-31, 41-52
                                No:   Claims


2. Citations and explanations 
see separate sheet 

VII. Certain defects in the international application 

The following defects in the form or contents of the international application have been noted:
see separate sheet 

VIII. Certain observations on the international application 

The following observations on the clarity of the claims, description, and drawings or on the question whether the 
claims are fully supported by the description, are made: 

see separate sheet 
Citations

The documents mentioned in this international preliminary examination report
(IPER) are numbered as in the international search report dated 16.12.98, i.e. D1 
corresponds to the first document of the search report etc. 

Re ITEM II (Priority) 

Since the priority document pertaining to the present application is not yet 
available to the IPEA, this IPER has been drawn up considering the priority date
(17.06.97) as valid. Documents D6-D13 have been published between the priority
date and the filing date of the present application. Thus, said documents do not
constitute prior art in the meaning of Rule 64(1)(b) PCT. However, if it turns out
that the effective date of the claimed subject-matter is not the priority date then 
D6-D13 will become relevant to assess whether the present application satisfies 
the criteria set forth in Art. 33(2) and (3) PCT. 
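
The date reasoning above can be sketched as a short check. This is only an illustration, assuming that a written disclosure counts as prior art when it was published before the relevant date; the function name `is_prior_art` is hypothetical, and the publication date used is that of the Science article listed in the search report (15 August 1997).

```python
from datetime import date


def is_prior_art(published: date, relevant_date: date) -> bool:
    """Return True if the disclosure was published before the relevant date."""
    return published < relevant_date


# Dates taken from the report: priority date 17.06.97, filing date 17.06.98.
PRIORITY_DATE = date(1997, 6, 17)
FILING_DATE = date(1998, 6, 17)

# One of the intermediate documents: the Science article of 15 August 1997.
science_article = date(1997, 8, 15)

# With a valid priority claim the article is not prior art ...
valid_priority = is_prior_art(science_article, PRIORITY_DATE)
# ... but it becomes relevant if only the filing date can be relied upon.
invalid_priority = is_prior_art(science_article, FILING_DATE)
```

Under these assumptions `valid_priority` is `False` and `invalid_priority` is `True`, which mirrors why the intermediate documents matter only if the priority claim fails.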

Re ITEM III (Non-establishment of opinion) 

As far as the subject-matter of claims 32-40, 53 and 54 is directed to in vivo
methods, it is also directed to methods for treatment of the human or animal body 
and thus, excluded from examination by Art. 34(4)(a)(i) PCT in combination with 
Rule 67.1(iv) PCT. 

No unified criteria exist among the PCT member states for the assessment
whether the treatment of the human or animal body is industrially applicable or 
not. The patentability can also be dependent upon the formulation of the claims. 
The EPO, for example, does not recognize as industrially applicable the subject-matter
of claims to the use of a compound in medical treatment, but will allow,
however, claims to a known compound for first use in medical treatment and the 
use of such a compound for the manufacture of a medicament for a new medical 
treatment. 
Re ITEM V (Novelty, inventive step, industrial applicability)

1 Summary of the present application 

The present application is related to a member of the Steroid Receptor 
Coactivator-1 (SRC-1) family designated "AIB1" (amplified in breast cancer-1). 
The application is further related to various uses of the AIB1 gene (SEQ ID NO:1)
and AIB1 polypeptides (SEQ ID NOs: 2, 3, 4, 8), respectively.

2 Novelty (Art. 33(2) PCT) 

2.1 The subject-matter of claims 3-10, 12-31 and 41-52 has not been made available
to the public by any of the available prior art documents and can therefore be 
regarded as novel. 

2.2 The subject-matter of claims 1, 2 and 11 does not meet the requirements of Art.
33(2) and 33(3) PCT because D1 already discloses a "substantially pure" DNA 
comprising a sequence encoding a human "AIB1" polypeptide and a cell 
comprising said DNA (see D1, p. 3448, left col., 2nd par.; also see Fig. 2 and Fig. 
3). 

3 Inventive step (Art. 33(3) PCT) 

The subject-matter of claims 3-10, 12-31 and 41-52 cannot be derived from the
available prior art in an obvious manner and therefore complies with the 
requirements of Art. 33(3) PCT. 

4 Industrial applicability (Art. 33(4) PCT) 

Claims 1-31 and 41-52 meet the criteria as set forth by Art. 33(4) PCT. 
Re ITEM VII (Certain defects in the international application)

1 The present application contains such a high number of independent claims (21 
out of 54) that the application as a whole lacks conciseness (Rule 6.1(a) PCT).
Independent claims which are directed to the same category (or merely worded 
differently) have not been made dependent upon each other to meet the 
requirements of Art. 6 PCT in combination with Rule 6.4 PCT. 

For instance, claims 1 and 7-9 are all directed to "a substantially pure DNA", 
claims 14, 18 and 21 to "a method of identifying a candidate compound" and
claims 45, 46, 48 and 50 to transgenic animals. Furthermore, maintaining the high
number of independent claims in the same category may give rise to a non-unity 
objection in regional phase examination. 

2 Dependent claims shall not refer to an "invention" but to the method or product of 
another claim (claims 47, 49, 51, 52).

Re ITEM VIII (Clarity and support by the description) 

1 Clarity of the claims (Art. 6 PCT) 

1.1 Rule 6.3(a) PCT requires that the matter for which protection is sought be defined 
in terms of technical features of the invention (also cf. PCT Guidelines III-4.4, as in
force from 09.10.98). A peptide/nucleic acid (claims 1, 2 and 12) is a chemical
compound which can be clearly and unambiguously defined by its chemical 
structure, i.e. its amino/nucleic acid sequence (no reference to the appropriate 
SEQ ID NO(s) is given in said claims, see also novelty objection raised under 
point V, 2.2). 

1.2 Additionally, "AIB1" is regarded as an internal designation which does not provide 
a technical teaching to the skilled person. In numerous cases the designation of 
genes or proteins has changed over time. An example of an ambiguous 
designation is given in present application, i.e. the human gene is designated 
"AIB1" whereas the murine gene is called "pCIP" (p. 11, l. 13 of present
description). Claims referring to a product or a method defined by said 
designations therefore lack clarity. The "AIB1" gene/protein and the "pCIP" gene
must be clearly and unambiguously defined (the appropriate SEQ ID NO(s) are
not included in independent claims 1, 2, 12, 14, 18, 21, 22, 28, 41, 42, 45, 48 and
50).

1.3 The degeneracy of the genetic code is only relevant with respect to an encoded 
peptide sequence (claim 9) . Since no such peptide sequence is referred to in said 
claim, reference to the degeneracy of the genetic code is inappropriate. 
Furthermore, it is considered that any DNA might fall under the scope of said
claims. The Applicant should resolve this issue to satisfy the requirements of Art.
6 PCT and adapt the description where necessary (e.g. p. 2, l. 21).

1.4 Claims 23, 26 and 27 erroneously refer to claim 21.

2 Sufficiency of disclosure (Art. 5 PCT)

2.1 In view of the homology to SRC-1 (see e.g. D2) the IPEA is of the opinion that to 
obtain a monoclonal antibody which "specifically" binds to human "AIB1" (claim
41) requires more than normal routine work, namely a cumbersome selection of
epitopes specific for "AIB1" not disclosed in present application (Art. 6 and Art. 5
PCT). 

2.2 The subject-matter of claim 45 and 47-52 refers to "transgenic animals" in general 
and therefore also includes such animals as humans (with the associated ethical 
and moral problems), fish, reptiles, insects, etc. The present description is not 
enabling for the whole range claimed (general animal kingdom) (see Example 7, 
and p. 23) (also cf. description p. 11, l. 9, "transgenic mammals").
Crystal Plaza 2

Washington, DC 20231 

ETATS-UNIS D'AMERIQUE 

in its capacity as elected Office 


Date of mailing (day/month/year) 
26 January 1999 (26.01.99) 


International application No. 
PCT/US98/12689 


Applicant's or agent's file reference 
4239-49944 


International filing date (day/month/year) 
17 June 1998 (17.06.98)


Priority date (day/month/year) 
Applicant

MELTZER, Paul et al 



1. The designated Office is hereby notified of its election made: 

☒ in the demand filed with the International Preliminary Examining Authority on:

11 January 1999 (11.01.99) 



□ in a notice effecting later election filed with the International Bureau on:



2. The election ☒ was

□ was not

made before the expiration of 19 months from the priority date or, where Rule 32 applies, within the time limit under
Rule 32.2(b). 



The International Bureau of WIPO
34, chemin des Colombettes
1211 Geneva 20, Switzerland

Facsimile No.: (41-22) 740.14.35
Form PCT/IB/331 (July 1992)



Authorized officer 

Lazar Joseph Panakal 
Applicant's or agent's file reference

4239-49944 


FOR FURTHER ACTION: see Notification of Transmittal of International Search Report
(Form PCT/ISA/220) as well as, where applicable, item 5 below.


International application No. 

PCT/US 98/12689 


International filing date (day/month/year) 

17/06/1998 


(Earliest) Priority Date (day/month/year) 

17/06/1997 


Applicant 

THE UNITED STATES OF AMERICA REPR et al . 



This International Search Report has been prepared by this International Searching Authority and is transmitted to the applicant 
according to Article 18. A copy is being transmitted to the International Bureau. 



This International Search Report consists of a total of .



sheets. 



☒ It is also accompanied by a copy of each prior art document cited in this report.



1. Certain claims were found unsearchable (see Box I).

2. Unity of invention is lacking (see Box II).

3. The international application contains disclosure of a nucleotide and/or amino acid sequence listing and the 
international search was carried out on the basis of the sequence listing 

☒ with the international application.

□ furnished by the applicant separately from the international application,

□ but not accompanied by a statement to the effect that it did not include
matter going beyond the disclosure in the international application as filed.

□ Transcribed by this Authority

4. With regard to the title, □ the text is approved as submitted by the applicant

☒ the text has been established by this Authority to read as follows:

AIB1, A STEROID RECEPTOR CO-ACTIVATOR



5. With regard to the abstract,

☒ the text is approved as submitted by the applicant

□ the text has been established, according to Rule 38.2(b), by this Authority as it appears in
Box III. The applicant may, within one month from the date of mailing of this International 
Search Report, submit comments to this Authority. 

The figure of the drawings to be published with the abstract is: 

Figure No. . | | as suggested by the applicant. Q None of the figures. 

□ because the applicant failed to suggest a figure.

□ because this figure better characterizes the invention.
Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons: 

1. ☒ Claims Nos.: 32-40, 53-54

because they relate to subject matter not required to be searched by this Authority, namely: 

Remark: Although claims 32-40, 53-54 

are directed to a method of treatment of the human/animal 
body, the search has been carried out and based on the alleged 
effects of the compound/composition. 



2. □ Claims Nos.:

because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:



3. □ Claims Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 

This International Searching Authority found multiple inventions in this international application, as follows: 



1. □ As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. □ As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.



3. □ As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:



4. □ No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 



Remark on Protest □ The additional search fees were accompanied by the applicant's protest.

□ No protest accompanied the payment of additional search fees.
A. CLASSIFICATION OF SUBJECT MATTER

IPC 6 C07K14/72 C12N15/12 C12N15/11 C07K16/18 C12Q1/68 
G01N33/53 A01K67/027 A61K38/17 A61K38/18 

According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

IPC 6 C12N C07K 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category °



Citation of document, with indication, where appropriate, of the relevant passages 



Relevant to claim No. 



X-Y GUAN ET AL.: "Hybrid selection of
transcribed sequences from microdissected
DNA: Isolation of genes within an
amplified region at 20q11-q13.2 in breast
cancer"

CANCER RESEARCH, 

vol. 56, no. 15, 1996, pages 3446-3450, 
XP002088091 

cited in the application 
see the whole document 

-/-- 



1,2 



Further documents are listed in the continuation of box C. 



☒ Patent family members are listed in annex.



° Special categories of cited documents:

"A" document defining the general state of the art which is not 
considered to be of particular relevance 

"E" earlier document but published on or after the international 
filing date 

"L" document which may throw doubts on priority claim(s) or 
which is cited to establish the publication date of another 
citation or other special reason (as specified) 

"O" document referring to an oral disclosure, use, exhibition or 
other means 

"P" document published prior to the international filing date but 
later than the priority date claimed 



"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

"&" document member of the same patent family



Date of the actual completion of the international search 

16 December 1998 


Date of mailing of the international search report 

13/01/1999 


Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2 
NL - 2280 HV Rijswijk 
Tel. (+31-70) 340-2040. Tx. 31 651 epo nl, 
Fax: (+31-70) 340-3016 


Authorized officer 

Mateo Rosell, A.M.
C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT


Category ° 


Citation of document, with indication, where appropriate, of the relevant passages


Relevant to claim No. 


A


WO 97 10337 A (BAYLOR COLLEGE MEDICINE)
20 March 1997
see page 5, line 10 - page 6, line 28
see page 15, line 20 - page 17, line 5
see page 15, line 16-22
see page 19, line 6 - page 20, line 28


1,2,10-12,15-20,22-28,32-40,43-45,53




A


GLASS C K ET AL: "NUCLEAR RECEPTOR
COACTIVATORS"
CURRENT OPINION IN CELL BIOLOGY,
vol. 9, no. 2, April 1997, pages 222-232,
XP002045759
see the whole document


1




A


OGRYZKO V V ET AL: "THE TRANSCRIPTIONAL
COACTIVATORS P300 AND CBP ARE HISTONE
ACETYLTRANSFERASES"
CELL,
vol. 87, no. 5, 29 November 1996, pages
953-959, XP002050401
see specially page 953


53,54




A


WO 95 21940 A (SALK INST FOR BIOLOGICAL
STUDIES) 17 August 1995
see abstract
see page 5, line 7 - page 8, line 18;
examples I-IV


53,54




P,A


DATABASE EMBL NUCLEOTIDE AND PROTEIN
SEQUENCES, 1 July 1997 XP002088092
HINXTON, GB
AC= 009000. P300/CBP/Co-integrator protein
Mus musculus.
see abstract


46




P,A


-& J. TORCHIA ET AL.: "The
transcriptional co-activator p/CIP binds
CBP and mediates nuclear-receptor
function"
NATURE,
vol. 387, no. 6634, 1997, pages 677-684,
XP002088153
see the whole document


46






-/-- 
C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT


Category ° 


Citation of document, with indication, where appropriate, of the relevant passages


Relevant to claim No. 


P,X


S.L. ANZICK ET AL.: "AIB1, a steroid
receptor coactivator amplified in breast
and ovarian cancer"
SCIENCE, 

vol. 277, no. 5328, 15 August 1997, pages

965-968, XP002088093 

Washington, DC, US 

cited in the application 

see the whole document and specially 

Figure X 




1,7-9 


P,X 


H. LI ET AL.: "RAC3, a steroid/nuclear
receptor-associated coactivator that is 
related to SRC-1 and TIF-2" 
PROCEEDINGS OF THE NATIONAL ACADEMY OF 
SCIENCES, 

vol. 94, 1 August 1997, pages 8479-8984, 

XP002088094 

WASHINGTON DC, US 

see the whole document and specially 
Figure y 




1,7-9 


P,X 


A. TAKESHITA ET AL.: "TRAM-1, a novel
160-kDa thyroid hormone receptor activator
molecule, exhibits distinct properties
from steroid receptor coactivator-1"
JOURNAL OF BIOLOGICAL CHEMISTRY,
vol. 272, 31 October 1997, pages 
27629-27634, XP002088095 
Bethesda, MD US 

see the whole document and specially 
Figure Z 




1,7-9 


P,X 


H. CHEN ET AL.: "Nuclear receptor
coactivator ACTR is a novel histone
acetyltransferase and forms a multimeric
activation complex with P/CAF and 
CBP/p300" 
CELL, 

vol. 90, no. 3, 8 August 1997, pages 
569-580, XP002088096 
see the whole document and specially 
Figure W 




1,7-9 


P,X 


FOROZAN F ET AL: "Genome screening by 
comparative genomic hybridization" 
TRENDS IN GENETICS, 
vol. 13, no. 10, October 1997, page 
405-409 XP004090560 

see the whole document and specially page 
407, column 1 




1 


P,X 


WO 98 03652 A (US HEALTH) 29 January 1998 
see page 3, line 1 - page 6, line 10 
see page 33, line 15-28 




53,54 
AIB1, A NOVEL STEROID RECEPTOR CO-ACTIVATOR

BACKGROUND OF THE INVENTION 

Breast cancer arises from estrogen-responsive breast epithelial cells. Estrogen activity is
thought to promote the development of breast cancer, and many breast cancers are initially
dependent on estrogen at the time of diagnosis. Anti-estrogen compositions have therefore been
used to treat breast cancer.

A frequent mechanism of increased gene expression in human cancers is amplification, i.e.,
the copy number of a DNA sequence is increased in a cancer cell compared to a non-cancerous
cell. In breast cancer, commonly amplified regions are derived from 17q21, 8q24, and 11q13,
which encode erbB-2, c-myc, and cyclin D1, respectively (Devilee et al., 1994, Crit. Rev. Oncog.
5:247-270). Recently, molecular cytogenetic studies have revealed the occurrence in breast cancers
of additional regions of increased DNA copy number (Isola et al., Am. J. Pathol. 147:905-911,
1995; Kallioniemi et al., Proc. Natl. Acad. Sci. USA 91:2156-2160, 1994; Muleris et al., Genes
Chromo. Cancer 10:160-170, 1994; Tanner et al., Cancer Research 54:4257-4260, 1994; Guan et
al., Nat. Genet. 8:155-161, 1994).

Breast cancer is the second leading cause of cancer deaths in American women, and it is
estimated that an American woman has at least a 10% cumulative lifetime risk of developing this
disease. Early diagnosis is an important factor in breast cancer prognosis and affects not only
survival rate, but also the range of therapeutic options available to the patient. For instance, if
diagnosed early, a "lumpectomy" may be performed, whereas later diagnosis tends to be associated
with more invasive and traumatic surgical treatments such as radical mastectomy. The treatment of
other cancers likewise benefits from early diagnosis; for instance, the prognosis in the treatment of
lung cancer, colorectal cancer and prostate cancer is greatly improved by early diagnosis. There
is a need for a simple and reliable method of diagnosis of cancers in general and of breast cancer in
particular. There is a need for a method of screening for compounds that inhibit the interaction
between an estrogen receptor (ER) and an ER-dependent nuclear receptor co-activator molecule in
order to identify molecules useful in research, diagnosis and treatment of cancer. There is also a
need for a method for identifying tamoxifen-sensitive cancer patients in order to better manage
treatment. A solution to these needs would improve cancer treatment and research and would save
lives.

SUMMARY OF THE INVENTION 

The inventors have discovered that the AIB1 protein (Amplified In Breast Cancer-1) is a
member of the Steroid Receptor Coactivator-1 (SRC-1) family of nuclear receptor co-activators
that interacts with estrogen receptors (ER) to enhance ER-dependent transcription. The inventors
have further discovered that the AIB1 gene is amplified and over-expressed in certain cancers,
including breast cancer, and that detection of amplified AIB1 genes can therefore be used to detect
cancerous cells. Importantly, the inventors have also found that AIB1 amplification is not confined
to breast cancer but is also found in cancers of the lung, ovary, head and neck, colon, testicles,
bladder, prostate, endometrium, kidney and stomach, and also in pheochromocytoma, melanoma,
ductal carcinoma and carcinoid tumor. Such a finding means that AIB1 may be useful in the
detection and treatment of all of the aforementioned cancers, which include some of the most
prevalent and deadly diseases in the western world.

The inventors have also discovered that AIB1 interacts with the proteins p300 and CBP,
which are nuclear cofactors that interact with other nuclear factors to promote transcription
(Chakravarti et al., Nature (383) 99-103 1996; Lundblad et al., Nature (374) 85-88 1995). The
inventors have, furthermore, determined that in cells with stable over-expression of AIB1, there is
a dramatic increase in steroid receptor activation (almost a 100-fold increase) leading to a
corresponding increase in transcriptional activation. The inventors have also used monoclonal
anti-AIB1 antibodies to demonstrate that AIB1 gene amplification is directly correlated with increased
AIB1 expression, and that these amplified copies of the gene are expressed in physiological
conditions. The inventors have found that AIB1 is the human ortholog of the mouse ER-dependent
transcriptional activator p/CIP, with the proteins having an overall amino acid identity of 81.6%.
These findings support the physiological role for AIB1 in cancer cells as a cofactor involved in
transcriptional regulation.
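
A figure such as the 81.6% amino acid identity quoted above is computed from an alignment of the two protein sequences. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with `-` as the gap character and that gapped columns are excluded from the count. Other conventions for the denominator exist, and the text does not say which was used. The function name and the toy peptides are hypothetical and are not AIB1 or p/CIP sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns containing a gap (an assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy peptides (hypothetical): 6 of 8 aligned residues are identical.
toy_identity = percent_identity("MKTAYIAK", "MKSAYLAK")
```

For the toy pair, six of the eight compared positions match, so `toy_identity` is 75.0.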

The invention features a substantially pure DNA which includes a sequence encoding an
AIB1 polypeptide, e.g., a human AIB1 polypeptide, or a fragment thereof. The DNA may have
the sequence of all or part of the naturally-occurring AIB1-encoding DNA or a degenerate variant
thereof. AIB1-encoding DNA may be operably linked to regulatory sequences for expression of the
polypeptide. A cell containing AIB1-encoding DNA is also within the invention.
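
A "degenerate variant" differs in nucleotide sequence but encodes the same polypeptide, because most amino acids are specified by more than one codon. The sketch below illustrates this with a deliberately partial codon table taken from the standard genetic code. The sequences are hypothetical toy examples and are not taken from the AIB1 cDNA.

```python
# Partial standard codon table (only the codons needed for the toy example).
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon using CODON_TABLE."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in the partial table")
        protein.append(CODON_TABLE[codon])
    return "".join(protein)


def is_degenerate_variant(dna_a: str, dna_b: str) -> bool:
    """True if the sequences differ but encode the same peptide."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Two different toy sequences that both encode Met-Ala-Lys.
same_peptide = is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG")
```

Both toy sequences translate to "MAK", so `same_peptide` is `True` even though the DNA differs at two positions.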

The invention also includes a substantially pure DNA containing a polynucleotide which
hybridizes at high stringency to an AIB1-encoding DNA or the complement thereof. A substantially
pure DNA containing a nucleotide sequence having at least 50% sequence identity to the full length
AIB1 cDNA, e.g., a nucleotide sequence encoding a polypeptide having the biological activity of an
AIB1 polypeptide, is also included.

The invention also features a substantially pure human AIB1 polypeptide and variants
thereof, e.g., polypeptides with conservative amino acid substitutions or polypeptides with
conservative or non-conservative amino acid substitutions which retain the biological activity of
naturally-occurring AIB1.

Diagnostic methods, e.g., to identify cells which harbor an abnormal copy number of the
AIB1 DNA, are also encompassed by the invention. An abnormal copy number, e.g., greater than
the normal diploid copy number, of AIB1 DNA is indicative of an aberrantly proliferating cell,
e.g., a steroid hormone-responsive cancer cell.
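
The copy number criterion above can be written as a simple comparison against the diploid baseline of two copies. The sketch below is illustrative only. It assumes that a target signal is normalized to a reference locus presumed to be present at the normal diploid copy number, an assumption the text does not make, and all names and signal values are hypothetical.

```python
NORMAL_DIPLOID_COPIES = 2


def estimated_copies(target_signal: float, reference_signal: float) -> float:
    """Estimate copies per cell from a target/reference signal ratio.

    Assumes the reference locus is present at the normal diploid copy number.
    """
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return NORMAL_DIPLOID_COPIES * target_signal / reference_signal


def is_abnormal_copy_number(copies: float) -> bool:
    """True if the copy number exceeds the normal diploid copy number."""
    return copies > NORMAL_DIPLOID_COPIES


# Hypothetical signals: the target is five times the reference.
copies = estimated_copies(target_signal=30.0, reference_signal=6.0)
flagged = is_abnormal_copy_number(copies)
```

With these invented values the estimate is 10.0 copies, which is above the diploid baseline, so `flagged` is `True`.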

The invention also includes antibodies, e.g., a monoclonal antibody or polyclonal antisera,
which bind specifically to AIB1 and can be used to detect the level of expression of AIB1 in a cell
or tissue sample. An increase in the level of expression of AIB1 in a patient-derived tissue sample
compared to the level in normal control tissue indicates the presence of a cell proliferative disorder
such as cancer.

Screening methods to identify compounds which inhibit an interaction of AIB1 with a steroid
hormone receptor, thus disrupting a signal transduction pathway which leads to aberrant cell
proliferation, are also within the invention. Proliferation of a cancer cell can therefore be reduced
by administering to an individual, e.g., a patient diagnosed with a steroid-responsive cancer, a
compound which inhibits expression of AIB1.

The invention also includes a knockout mutant, for example a mouse (or other mammal)
from whose genome at least one AIB1 gene has been selectively deleted. Such a mouse is
useful in research; for instance, the phenotype gives insight into the physiological role of the
deleted gene. For instance, the mutant may be defective in specific biochemical pathways; such a
knockout mutant may be used in complementation experiments to determine whether other genes
or proteins complement the deleted gene.
Homozygous and heterozygous mutants are included in this aspect of the invention.

The present invention also includes a mutant organism, for example a mammal such as a
mouse, which contains more than the normal number of AIB1 genes in its genome. Such a mouse
may contain additional copies of the AIB1 gene integrated into its chromosomes, for instance in the
form of a pro-virus, or may carry additional copies on extra-chromosomal elements such as
plasmids. Such a mutant mouse is useful for research purposes, to elucidate the physiological or
pathological role of AIB1. For instance, the role of AIB1 expression as cause or effect in cancers
may be investigated by inducing or transplanting tumors into such mutants, and comparing such
mutants with normal mice having the same cancer.

The present invention also includes a mutant organism, for example a mammal, e.g. a
mouse, that contains, either integrated into a chromosome or on a plasmid, at least one copy of the
AIB1 gene driven by a non-native promoter. Such a promoter may be constitutive or may be
inducible. For instance, the AIB1 gene may be operatively linked to a mouse mammary tumor
virus (MMTV) promoter or other promoter from a mammalian virus, allowing manipulation of
AIB1 expression. Such a mutant would be useful for research purposes to determine the
physiological or pathological role of AIB1. For instance, over- or under-expression could be
effected and physiological effects observed.

The invention also includes methods for treatment of cancers that involve functions of or
alterations in the signaling pathways that use p300 and/or CBP as signal transducing molecules.
The treatments of the invention involve targeting of the AIB1 protein or AIB1 gene to enhance or
reduce interaction with p300 and/or CBP proteins. For instance, the AIB1 gene sequence as
disclosed herein may be used to construct an anti-sense nucleotide. An anti-sense RNA may be
constructed that is anti-parallel and complementary to the AIB1 transcript (or part thereof) and
which will therefore form an RNA-RNA duplex with the AIB1 transcript, preventing translation
and expression of AIB1. Alternatively, treatments may comprise contacting an AIB1 protein with a
molecule that specifically binds to the AIB1 molecule in vivo, thereby interfering with AIB1
binding with other factors such as p300 or CBP. Such processes are designed to inhibit signal
transduction pathways involving AIB1, p300, CBP and other factors and therefore inhibit cancer
cell proliferation that is effected via these pathways. As explained in more detail below, AIB1
overexpression results in increased ER-dependent transcriptional activity which confers a growth
advantage upon AIB1 amplification-bearing clones during the development and progression of
estrogen-dependent cancers.

Compounds which inhibit or disrupt the interaction of an AIB1 gene product with a steroid
hormone receptor, e.g., ER, are useful as anti-neoplastic agents for the treatment of patients
suffering from steroid hormone-responsive cancers such as breast cancer, ovarian cancer, prostate
cancer, and colon cancer.

AIB1 polypeptides or peptide mimetics of such polypeptides, e.g., those containing domains
which interact with steroid hormone receptors, can be administered to patients to block the
interaction of endogenous intracellular AIB1 and a steroid hormone receptor, e.g., ER, in an
aberrantly proliferating cell. It is likely that AIB1 interacts with a wide range of human
transcriptional factors and that regulation of such interactions will have important therapeutic
applications.

Other features and advantages of the invention will be apparent from the following
description of the preferred embodiments thereof, and from the claims.

SEQUENCE LISTING 
The nucleic acid and amino acid sequences listed in the accompanying Sequence Listing are
shown using standard letter abbreviations for nucleotide bases and three-letter code for amino acids.
Only one strand of each nucleic acid sequence is shown, but the complementary strand is
understood to be included by any reference to the displayed strand.

SEQ. I.D. No. 1 shows the nucleic acid sequence of the human AIB1 cDNA and the
corresponding amino acid sequence.

SEQ. I.D. No. 2 shows the amino acid sequence of the Per/Arnt/Sim (PAS) domain of
AIB1.

SEQ. I.D. No. 3 shows the amino acid sequence of the basic helix-loop-helix domain
(bHLH) of AIB1.

SEQ. I.D. No. 4 shows the amino acid sequence of the human AIB1 protein.

SEQ. I.D. No. 5 shows the nucleic acid sequence of primer N8F1.

SEQ. I.D. No. 6 shows the nucleic acid sequence of the forward primer designed from the
5' sequence of pCMVSPORT-B11, PM-U2.

SEQ. I.D. No. 7 shows the nucleic acid sequence of the reverse primer designed from the 5'
sequence of pCMVSPORT-B11, PM-U2.

SEQ. I.D. No. 8 shows the amino acid sequence of the ER-interacting domain of AIB1.

SEQ. I.D. No. 9 shows the nucleic acid sequence of pCIP, the mouse ortholog of AIB1, and
the amino acid sequence for this gene.

SEQ. I.D. No. 10 shows the nucleic acid sequence of the forward primer AIB1/mESTF1
used to screen mouse BAC.

SEQ. I.D. No. 11 shows the nucleic acid sequence of the reverse primer AIB1/mESTR1
used to screen mouse BAC.

SEQ. I.D. No. 12 shows the amino acid sequence of pCIP, the mouse ortholog of AIB1.



FIGURES

Fig. 1A is a diagram of an amino acid sequence of full length AIB1 in which residues
highlighted in black are identical in AIB1, TIF2 and SRC-1. Residues identical with TIF2
(GenBank accession number X97674) or SRC-1 (GenBank accession number U59302) are
highlighted in grey or boxed, respectively.

Fig. 1B is a diagram showing the structural features of AIB1. The following domains are
indicated: bHLH domain, PAS domains (with the highly conserved PAS A and B regions shown in
dark gray), S/T (serine/threonine)-rich regions, and a group of charged residues (+/-). A
glutamine-rich region and polyglutamine tract are also indicated. The numbers beneath the diagram
indicate the location (approximate residue number) of the domain with respect to the amino acid
sequence shown in Fig. 1A. The alignment was generated using DNASTAR software.

Fig. 2 is a photograph of a Northern blot analysis showing increased expression of AIB1 in
the cell lines BT-474, ZR-75-1, MCF7, and BG-1.

Fig. 3 is a bar graph showing that the addition of full length AIB1 DNA to a cell resulted in
an increase of estrogen-dependent transcription from an ER reporter plasmid. COS-1 cells were
transiently transfected with 250 ng ER expression vector (pHEGO-hyg), 10 ng of luciferase
reporter plasmid (pGL3.luc.3ERE or 10 ng pGL3 lacking ERE) and increasing amounts of
pcDNA3.1-AIB1 and incubated in the absence (open bars) or presence of 10 nM 17β-estradiol (E2,
solid bars) or 100 nM 4-hydroxytamoxifen (hatched bars). Luciferase activity was expressed in
relative luminescence units (RLU). The data are the mean of three determinations from one of four
replicate experiments. Error bars indicate one standard deviation.

Fig. 4 is a schematic diagram comparing the DNA and protein structures of pCIP (the
mouse ortholog of AIB1) and the human AIB1; exons are shown as black boxes.

Fig. 5 is a table showing the introns and exons of the mouse AIB1 gene (pCIP). The "Exon"
column refers to the number of the exon; "cDNA bp 5'-exon" refers to the nucleotide position in
the mouse cDNA sequence for the 5' exon. "3' intron splice cite" refers to the last few nucleotides
of the 3' position of the intron. "Exon sequence" refers to the exon itself. "5' intron" refers to the
adjacent intron reading from the exon into the splice donor dinucleotides (usually GT).

Fig. 6 is a table showing the introns and exons of the human AIB1 gene. The "Exon"
column refers to the number of the exon; "cDNA bp 5'-exon" refers to the nucleotide position in
the mouse cDNA sequence for the 5' exon. "3' intron splice cite" refers to the last few nucleotides
of the 3' position of the intron. "Exon sequence" refers to the exon itself. "5' intron" refers to the
adjacent intron reading from the exon into the splice donor nucleotides (usually GT).

DETAILED DESCRIPTION 

The invention is based on the discovery of a novel gene, amplified in breast cancer-1
(AIB1), which is overexpressed in breast cancer. AIB1 has the structural features of a co-activator
of the steroid hormone receptor family. The steroid hormone estrogen and other related steroid
hormones act on cells through specific steroid receptors.

Members of the steroid receptor coactivator (SRC) family of transcriptional co-activators
interact with nuclear hormone receptors to enhance ligand-dependent transcription. AIB1 is a novel
member of the SRC family which was found to be overexpressed in breast cancers. The AIB1
gene is located at human chromosome 20q. High-level AIB1 amplification and overexpression
were observed in several estrogen receptor (ER) positive breast and ovarian cancer cell lines, as
well as in uncultured breast cancer specimens. AIB1 amplification is not confined to breast cancer
but is also found in cancers of the lung, ovary, head and neck, colon, testicles, bladder, prostate,
endometrium, kidney, stomach and also in pheochromocytoma, melanoma, ductal carcinoma and
carcinoid tumor.

Transfection of AIB1 into cells resulted in marked enhancement of estrogen-dependent
transcription. These observations indicated that AIB1 functions as a co-activator of steroid
hormone receptors such as ER (including estrogen receptor α (ERα) and estrogen receptor β
(ERβ)), androgen receptor (e.g., expressed in prostate cells), retinoid receptor (e.g., isoforms α,
γ, and retinoid X receptor (RXR)), progesterone receptor (e.g., expressed in breast cells),
mineralocorticoid receptor (implicated in salt metabolism disorders), vitamin D receptor
(implicated in calcium metabolism disorders), thyroid hormone receptor (e.g., thyroid hormone
receptor α), or glucocorticoid receptor (e.g., expressed in spleen and thymus cells). The altered
expression of AIB1 contributes to the initiation and progression of steroid hormone-responsive
cancers by increasing the transcriptional activity of the steroid receptor.

A substantially pure DNA which includes an AIB1-encoding polynucleotide (or the
complement thereof) is claimed. By "substantially pure DNA" is meant DNA that is free of the
genes which, in the naturally-occurring genome of the organism from which the DNA of the
invention is derived, flank the AIB1 gene. The term therefore includes, for example, a
recombinant DNA which is incorporated into a vector, into an autonomously replicating plasmid or
virus, or into the genomic DNA of a prokaryote or eukaryote at a site other than its natural site; or
which exists as a separate molecule (e.g., a cDNA or a genomic or cDNA fragment produced by
PCR or restriction endonuclease digestion) independent of other sequences. It also includes a
recombinant DNA which is part of a hybrid gene encoding an additional polypeptide sequence.
Preferably, the polypeptide includes a Per/Arnt/Sim (PAS) domain
(LLQALDGFLFVVNRDGNIVFVSENVTQYLQYKQEDLVNTSVYNILHEEDRKDFLKNLPKST 
VNGVSWTNETQRQKSHTFNCRMLMKTPHDILEDINASPEMRQRYETMQCFALSQPRAMME 
EGEDLQSCMICVARRITTGERTFPSNPESFITRHDLSGKVVNIDTNSLRSSMRPGFEDIIRRCIQ
; SEQ. I.D. NO. 2) and/or a basic helix-loop-helix
(bHLH) domain (RKRKLPCDTPGQGLTCSGEKRRREQESKYIEELAELISANLSDIDNFNVKPD
KCAILKETVRQIRQIKEQGKT; SEQ. I.D. NO. 3); more preferably, the AIB1 polypeptide
includes the amino acid sequence of the entire naturally-occurring AIB1 protein (Fig. 1; SEQ. I.D.
NO. 4). Preferably, the peptide includes an ER-interacting domain of AIB1 (e.g., a domain
comprising approximately amino acids 300 to 1250:

CIQRFFSLNDGQSWSQKRHYQEAYLNGHAETPVYRFSLADGTIVTAQTKSKLF 
RNPVTNDRHGFVSTHFLQREQNGYRPNPNPVGQGIRPPMAGCNSSVGGMSMS 
PNQGLQMPSSRAYGLADPSTTGQMSGARYGGSSNIASLTPGPGMQSPSSYQNNNYGLNMSS 

PPHGSPGLAPNQQNIMISPRNRGSPKIASHQFSPVAGVHSPMASSGNTGNHSFSSSSLSALQAI
SEGVGTSLLSTLSSPGPKLDNSPNMNITQPSKVSNQDSKSPLGFYCDQNPVESSMCQSNSRDH 
LSDKESKESSVEGAENQRGPLESKGHKKLLQLLTCSSDDRGHSSLTNSPLDSSCKESSVSVTS 
PSGVSSSTSGGVSSTSNMHGSLLQEKHRILHKLLQNGNSPAEVAKITAEATGKDTSSITSCGD 
GNVVKQEQLSPKKKENNALLRYLLDRDDPSDALSKELQPQVEGVDNKMSQCTSSTIPSSSQE 

KDPKIKTETSEEGSGDLDNLDAILGDLTSSDFYNNSISSNGSHLGTKQQVFQGTNSLGLKSSQ
SVQSIRPPYNRAVSLDSPVSVGSSPPVKNISAFPMLPKQPMLGGNPRMMDSQENYGSSMGGP 
NRNVTVTQTPSSGDWGLPNSKAGRMEPMNSNSMGRPGGDYNTSLPRPALGGSIPTLPLRSN 
SIPGARPVLQQQQQMLQMRPGEIPMGMGANPYGQAAASNQLGSWPDGMLSMEQVSHGTQ 
NRPLLRNSLDDLVGPPSNLEGQSDERALLDQLHTLLSNTDATGLEEIDRALGIPELVNQGQA 

LEPKQDAFQGQEAAVMMDQKAGLYGQTYPAQGPPMQGGFHLQGQSPSFNSMMNQMNQQ
GNFPLQGMHPRANIMRPRTNTPKQLRMQLQQRLQGQQFLNQSRQALELKMENPTAGGAA 
VMRPMMQPQQGFLNAQMVAQRSRELLSHHFRQQRVAMMMQQQQQQQ (SEQ. I.D. NO.
8)). A cell containing substantially purified AIB1-encoding DNA is also within the invention.

The invention also includes a substantially pure DNA which contains a polynucleotide which
hybridizes at high stringency to an AIB1 cDNA having the sequence of SEQ. I.D. NO. 1, or the
complement thereof, and a substantially pure DNA which contains a nucleotide sequence having at
least 50% (for example at least 75%, 90%, 95%, or 98-100%) sequence identity to SEQ. I.D. NO.
1, provided the nucleotide sequence encodes a polypeptide having the biological activity of an AIB1
polypeptide. By "biological activity" is meant steroid receptor co-activator activity. For example,
allelic variations of the naturally-occurring AIB1-encoding sequence (SEQ. I.D. NO. 1) are
encompassed by the invention. Sequence identity can be determined by comparing the nucleotide
sequences of two nucleic acids using the BLAST sequence analysis software, for instance, the
NCBI gapped BLAST 2.0 program set to default parameters. This software is available from The
National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/BLAST).
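
The percent-identity thresholds above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by BLAST) and are supplied as equal-length strings with "-" marking gaps; the function name and the example sequences are hypothetical and are not part of the disclosure. Identity is taken here as matching positions divided by alignment length, which is a simplification of what an alignment program reports.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gap positions ("-") count toward the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Made-up example alignment (not taken from SEQ. I.D. NO. 1).
identity = percent_identity("ACGTACGT", "ACGTTCGT")
meets_50_percent = identity >= 50.0
```

A sequence would meet the lowest recited threshold when the returned value is at least 50.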

Hybridization is carried out using standard techniques such as those described in Ausubel et
al., Current Protocols in Molecular Biology, John Wiley & Sons, (1989). "High stringency" refers
to DNA hybridization and wash conditions characterized by high temperature and low salt
concentration, e.g., wash conditions of 65°C at a salt concentration of approximately 0.1X SSC.
"Low" to "moderate" stringency refers to DNA hybridization and wash conditions characterized by
low temperature and high salt concentration, e.g., wash conditions of less than 60°C at a salt
concentration of at least 1.0X SSC. For example, high stringency conditions may include
hybridization at about 42°C and about 50% formamide; a first wash at about 65°C, about 2X
SSC, and 1% SDS; followed by a second wash at about 65°C and about 0.1X SSC. Lower
stringency conditions suitable for detecting DNA sequences having about 50% sequence identity to
an AIB1 gene include, for example, hybridization at about 42°C in the absence of
formamide; a first wash at about 42°C, about 6X SSC, and about 1% SDS; and a second wash at
about 50°C, about 6X SSC, and about 1% SDS.

A substantially pure DNA including (a) the sequence of SEQ. I.D. NO. 1 or (b) a degenerate
variant thereof is also within the invention. The AIB1-encoding DNA is preferably operably linked
to regulatory sequences (including, e.g., a promoter) for expression of the polypeptide.

By "operably linked" is meant that a coding sequence and a regulatory sequence(s) are
connected in such a way as to permit gene expression when the appropriate molecules
(e.g., transcriptional activator proteins) are bound to the regulatory sequence(s).

The invention also includes a substantially pure human AIB1 polypeptide or fragment
thereof. The AIB1 fragment may include an ER-interaction domain such as one having the amino
acid sequence of SEQ. I.D. NO. 8. Alternatively, the fragment may contain the amino acid
sequence of SEQ. I.D. NOS. 2, 3, or 4.

Screening methods to identify candidate compounds which inhibit estrogen-dependent
transcription, AIB1 expression, or an AIB1/ER interaction (and as a result, proliferation of steroid
hormone-responsive cancer cells) are within the scope of the invention. For example, a method of
identifying a candidate compound which inhibits ER-dependent transcription is carried out by
contacting the compound with an AIB1 polypeptide and determining whether the compound binds to
the polypeptide. Binding of the compound to the polypeptide indicates that the compound inhibits
ER-dependent transcription, and in turn, proliferation of steroid hormone-responsive cancer cells.
Preferably, the AIB1 polypeptide contains a PAS domain or a bHLH domain. Alternatively, the
method is carried out by contacting the compound with an AIB1 polypeptide and an ER polypeptide
and determining the ability of the compound to interfere with the binding of the ER polypeptide
with the AIB1 polypeptide. A compound which interferes with an AIB1/ER interaction inhibits
ER-dependent transcription.

A method of screening a candidate compound which inhibits an interaction of an AIB1
polypeptide with an ER polypeptide in a cell includes the steps of (a) providing a GAL4 binding site
linked to a reporter gene; (b) providing a GAL4 binding domain linked to either (i) an AIB1
polypeptide or (ii) an ER polypeptide; (c) providing a GAL4 transactivation domain II linked to the
ER polypeptide if the GAL4 binding domain is linked to the AIB1 polypeptide or linked to the
AIB1 polypeptide if the GAL4 binding domain is linked to the ER polypeptide; (d) contacting the
cell with the compound; and (e) monitoring expression of the reporter gene. A decrease in
expression in the presence of the compound compared to that in the absence of the compound
indicates that the compound inhibits an interaction of an AIB1 polypeptide with the ER polypeptide.
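
The readout of step (e) can be expressed as a short Python sketch. It is illustrative only: the function name, the reporter units, and the optional `min_decrease` cutoff are assumptions introduced here, since the description requires only a decrease in reporter expression relative to the untreated cell.

```python
def inhibits_interaction(
    reporter_with_compound: float,
    reporter_without_compound: float,
    min_decrease: float = 0.0,
) -> bool:
    """Return True when reporter expression falls in the presence of the compound.

    min_decrease is a hypothetical fractional cutoff (0.25 means a 25% drop);
    the default of 0.0 treats any decrease as a positive result.
    """
    if reporter_without_compound <= 0:
        raise ValueError("control reporter signal must be positive")
    fractional_decrease = (
        reporter_without_compound - reporter_with_compound
    ) / reporter_without_compound
    return fractional_decrease > min_decrease


# Made-up reporter readings in arbitrary units.
candidate_is_hit = inhibits_interaction(40.0, 100.0)
```

In practice a cutoff above zero would likely be chosen to allow for assay noise; that choice is outside what the description specifies.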

Diagnostic methods to identify an aberrantly proliferating cell, e.g., a steroid hormone-responsive
cancer cell such as a breast cancer cell, ovarian cancer cell, or prostate cancer cell, are
also included in the invention. For example, a method of detecting an aberrantly proliferating cell
in a tissue sample is carried out by determining the level of AIB1 gene expression in the sample.
An increase in the level of gene expression compared to that in a normal control tissue indicates the
presence of an aberrantly proliferating cell. AIB1 gene expression is measured using an AIB1
gene-specific polynucleotide probe, e.g., in a Northern assay or polymerase chain reaction (PCR)-based
assay, to detect AIB1 mRNA transcripts. AIB1 gene expression can also be measured using
an antibody specific for an AIB1 gene product, e.g., by immunohistochemistry or Western blotting.

Aberrantly proliferating cells, e.g., cancer cells, in a tissue sample may be detected by
determining the number of cellular copies of an AIB1 gene in the tissue. An increase in the
number of gene copies in a cell of a patient-derived tissue, compared to that in normal control
tissue, indicates the presence of a cancer. A copy number greater than 2 (the normal diploid copy
number) is indicative of an aberrantly proliferative cell. Preferably, the copy number is greater
than 5 copies per diploid genome, more preferably 10 copies, more preferably greater than 20, and
most preferably greater than 25 copies. An increase in copy number compared to the normal
diploid copy number indicates that the tissue sample contains aberrantly proliferating steroid
hormone-responsive cancer cells. AIB1 copy number is measured by fluorescent in situ
hybridization (FISH), Southern hybridization techniques, and other methods well known in the art
(Kallioniemi et al., PNAS 91: 2156-2160 (1994); Guan et al., Nature Genetics 8: 155-161 (1994);
Tanner et al., Clin. Cancer Res. 1: 1455-1461 (1995); Guan et al., Cancer Res. 56: 3446-3450
(August 1996); Anzick et al., Science 277: 965-968 (August 1997)).
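
The copy-number criteria above can be summarized in a minimal Python sketch. The function and constant names are hypothetical, and the phrase "more preferably 10 copies" is read here as "greater than 10", which is an interpretive assumption rather than something the description states.

```python
NORMAL_DIPLOID_COPIES = 2
# Progressively preferred thresholds recited in the description (copies per diploid genome).
PREFERRED_THRESHOLDS = (5, 10, 20, 25)


def classify_copy_number(copies_per_cell: float) -> str:
    """Label a measured AIB1 copy number against the recited thresholds."""
    if copies_per_cell <= NORMAL_DIPLOID_COPIES:
        return "normal"
    exceeded = [t for t in PREFERRED_THRESHOLDS if copies_per_cell > t]
    if not exceeded:
        return "increased (>2)"
    return f"increased (>{max(exceeded)})"


# Made-up measurements, e.g. mean FISH signals per interphase nucleus.
labels = [classify_copy_number(n) for n in (2, 3, 12, 26)]
```

Any label other than "normal" corresponds to a copy number above the normal diploid value and, under the description, indicates an aberrantly proliferating cell.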

Aberrantly proliferating cells can also be identified by genetic polymorphisms in the
polyglutamine tract of AIB1, e.g., variations in the size of this domain which alter AIB1
co-activator activity.

The invention also includes methods of treating a mammal, e.g., a human patient. For
example, a method of reducing proliferation of a steroid hormone-responsive cancer cell, e.g., an
estrogen-responsive breast cancer cell, in a mammal is carried out by administering to the mammal
a compound which inhibits expression of AIB1. The compound reduces transcription of AIB1-encoding
DNA in the cell. Alternatively, the compound reduces translation of an AIB1 mRNA into
an AIB1 gene product in the cell. For example, translation of AIB1 mRNA into an AIB1 gene
product is inhibited by contacting the mRNA with antisense polynucleotides complementary to the
AIB1 mRNA.
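
As an illustration of what "complementary" means for an antisense polynucleotide, the following Python sketch derives the antisense RNA strand for a short mRNA fragment by taking its reverse complement. The function name and the six-base fragment are made up for the example and are not taken from SEQ. I.D. NO. 1.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the antisense strand, written 5'->3', for an mRNA written 5'->3'.

    DNA-style input is accepted: any T is read as U before pairing.
    """
    normalized = mrna.upper().replace("T", "U")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(normalized))


# Hypothetical fragment: 5'-AUGGCC-3' pairs with 5'-GGCCAU-3'.
antisense = antisense_rna("AUGGCC")
```

Because the result is both reversed and complemented, each base of the antisense strand pairs with the corresponding base of the transcript when the two strands are aligned anti-parallel.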

A method of inhibiting ER-dependent transcription in a breast cell of a mammal is carried
out by administering an effective amount of an AIB1 polypeptide or a peptide mimetic thereof to
the mammal. Preferably, the polypeptide inhibits an AIB1/ER interaction; more preferably, the
polypeptide contains an ER-interacting domain, a PAS domain or a bHLH domain of AIB1. By
binding to ER, such a polypeptide inhibits binding of AIB1 to ER, thereby inhibiting ER-dependent
transcription.

The invention also includes antibodies, e.g., a monoclonal antibody or polyclonal antisera,
which bind specifically to AIB1. The term "antibody" as used in this invention includes whole
antibodies as well as fragments thereof, such as Fab, Fab', F(ab')2, and Fv which bind to an AIB1
epitope. These antibody fragments are defined as follows: (1) Fab, the fragment which contains a
monovalent antigen-binding fragment of an antibody molecule produced by digestion of whole
antibody with the enzyme papain to yield an intact light chain and a portion of one heavy chain; (2)
Fab', the fragment of an antibody molecule obtained by treating whole antibody with pepsin,
followed by reduction, to yield an intact light chain and a portion of the heavy chain; two Fab'
fragments are obtained per antibody molecule; (3) F(ab')2, the fragment of the antibody obtained by
treating whole antibody with the enzyme pepsin without subsequent reduction; F(ab')2 is a dimer of
two Fab' fragments held together by two disulfide bonds; (4) Fv, a genetically engineered fragment
containing the variable region of the light chain and the variable region of the heavy chain
expressed as two chains; and (5) single chain antibody ("SCA"), a genetically engineered molecule
containing the variable region of the light chain and the variable region of the heavy chain, linked by a
suitable polypeptide linker as a genetically fused single chain molecule. Methods of making these
fragments are routine.

Also within the invention is a method of identifying a tamoxifen-sensitive patient (one who is
likely to respond to tamoxifen treatment by a reduction in rate of tumor growth) wherein the
method includes the steps of (a) contacting a patient-derived tissue sample with tamoxifen; and (b)
determining the level of AIB1 gene expression or amplification in the sample. An increase in the
level of expression or gene copy number compared to the level or cellular copy number in normal
control tissue indicates that the patient is tamoxifen-sensitive.

AIB1 gene expression is measured using an AIB1 gene-specific polynucleotide probe, e.g.,
in a Northern blot or PCR-based assay to detect AIB1 mRNA transcripts or in a Southern blot or
FISH assay to detect amplification of the gene (which correlates directly with AIB1 gene
expression). Alternatively, AIB1 gene expression is measured by detecting an AIB1 gene product,
e.g., using an AIB1-specific antibody.

Transgenic mammals, e.g., mice, which overexpress an AIB1 gene product, e.g., by virtue
of harboring multiple copies of AIB1-encoding DNA, are also within the invention.

"Transgenic" as used herein means a mammal which bears a transgene, a DNA sequence
which is inserted by artifice into an embryo, and which then becomes part of the genome of the
mammal that develops from that embryo. Any non-human mammal which may be produced by
transgenic technology is included in the invention; preferred mammals include mice, rats, cows,
pigs, sheep, goats, rabbits, guinea pigs, hamsters, and horses.

By "transgene" is meant DNA which is partly or entirely heterologous (i.e., foreign) to the
transgenic mammal, or DNA homologous to an endogenous gene of the transgenic mammal, but
which is inserted into the mammal's genome at a location which differs from that of the natural
gene.

Also within the invention is a knockout mutant, for instance a knockout mouse wherein the
mouse has had at least one copy of the AIB1 gene (also called the pCIP gene in mice) deleted from
its genome. Such a knockout mutant would be useful in research; for instance, the phenotype gives
insight into the physiological role of AIB1. Complementation experiments using such a knockout
mutant can be used to identify other genes and proteins that make up for the lack of AIB1 in the
mutant to restore wild-type phenotype.

Also within the invention is a mutant, such as a mouse, which contains more than the
normal number of copies of the AIB1 (pCIP) gene, either integrated into a chromosome, for
instance as a pro-virus, or in an extra-chromosomal element, such as on a plasmid.

Also within the invention is a mutant, for example, a mouse, which contains the AIB1
(pCIP) gene driven by a non-native promoter, such as a constitutive or an inducible promoter, such
as the mouse mammary tumor virus (MMTV) promoter.

The invention also includes methods of treatment for cancers the growth of which involves
alterations of signaling pathways involving p300 and/or CBP. For example, AIB1 (pCIP) may be
contacted with a molecule that binds to AIB1 and inhibits AIB1's interaction with p300, thereby
disrupting signaling of this pathway and reducing transcription of molecules whose transcription is
positively regulated by this pathway, thereby reducing tumor growth.

Example 1: Cloning and Expression of AIB1
A. Cloning of AIB1

Chromosome microdissection and hybrid selection techniques were used to isolate probes
and clone gene sequences which map to chromosome 20q, one of the recurrent sites of DNA
amplification in breast cancer cells identified by molecular cytogenetics (Kallioniemi et al., PNAS
91: 2156-2160 (1994); Guan et al., Nature Genetics 8: 155-161 (1994); Tanner et al., Clin. Cancer
Res. 1: 1455-1461 (1995); Guan et al., Cancer Res. 56: 3446-3450 (August 1996); Anzick et al.,
Science 277: 965-968 (August 1997)). AIB1 is a member of the SRC-1 family of nuclear receptor
(NR) co-activators. AIB1 functions to enhance ER-dependent transcription. SRC-1 and the closely
related TIF2 are steroid receptor co-activators with an affinity for NRs. The mouse ortholog of
human AIB1 is called pCIP. In this application pCIP and AIB1 will be used synonymously unless
the contrary is clearly expressed.

To characterize AIB1, the full length cDNA was cloned and sequenced. An AIB1-specific
primer N8F1 (5'-TCATCACTTCCGACAACAGAGG-3'; SEQ. I.D. NO. 5) was biotinylated and
used to capture cDNA clones from a human lung cDNA library (Gibco, BRL) using the
GENETRAPPER cDNA Positive Selection System (Gibco, BRL). The largest clone (5.8 kb),
designated pCMVSPORT-B11, was selected for sequence analysis. To obtain full-length AIB1-encoding
DNA, a random-primed library from BT-474 was constructed in bacteriophage λ-Zap
(Stratagene) and hybridized with a 372 bp 32P-labeled PCR product amplified from a human spleen
cDNA library using primers designed from the 5' sequence of pCMVSPORT-B11, PM-U2
(5'-CCAGAAACGTCACTATCAAG-3', forward primer; SEQ. I.D. NO. 6) and B11-11RA
(5'-TTACTGGAACCCCCATACC-3', reverse primer; SEQ. I.D. NO. 7). Plasmid rescue of 19
positive clones yielded a clone, pBluescript-R22, which overlapped pCMVSPORT-B11 and
contained the 5' end of the coding region. To generate a full length AIB1 clone, the 4.85 kb
HindIII/XhoI fragment of pCMVSPORT-B11 was subcloned into HindIII/XhoI sites of pBluescript-R22.
The 4.84 kb NotI/NheI fragment of the full length clone containing the entire coding region
was then subcloned into the NotI/XbaI sites of the expression vector, pcDNA3.1 (Invitrogen),
generating pcDNA3.1-AIB1.

The cloned DNA sequence (SEQ. I.D. No. 1) revealed an open reading frame (beginning at
the underlined "ATG") encoding a protein of 1420 amino acids with a predicted molecular weight
of 155 kDa (Fig. 1A). Database searches with BLASTP identified a similarity of AIB1 with TIF2
(45% protein identity) and SRC-1 (33% protein identity). Like TIF2 and SRC-1, AIB1 contains a
bHLH domain preceding a PAS domain, serine/threonine-rich regions, and a charged cluster (Fig.
1B). There is also a glutamine-rich region which, unlike SRC-1 and TIF2, contains a
polyglutamine tract (Fig. 1B). The polyglutamine tract of AIB1 is subject to genetic
polymorphism. Variations in the size of this domain alter AIB1 co-activator activity.
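
As a rough consistency check on the figures above, the following Python sketch estimates the protein mass and minimum coding length implied by a 1420-residue open reading frame. It assumes an average residue mass of about 110 Da, a common rule of thumb that is not a value taken from this description; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, assumed for illustration


def estimate_protein_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def coding_length_nt(n_residues: int) -> int:
    """Nucleotides needed to encode the protein, including one stop codon."""
    return 3 * n_residues + 3


mass_kda = estimate_protein_mass_kda(1420)
orf_nt = coding_length_nt(1420)
```

The estimate of roughly 156 kDa is in line with the predicted 155 kDa, and a coding length of about 4.3 kb fits within the 4.84 kb fragment described as containing the entire coding region.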

B. Expression of AIB1

Amplification and expression of AIB1 in several ER positive and negative breast and ovarian
cancer cell lines was examined. Established breast cancer cell lines used in the experiments
described below (see, e.g., Fig. 2) were obtained from the American Type Culture Collection
(ATCC): BT-474, MCF-7, T-47D, MDA-MB-361, MDA-MB-468, BT-20, MDA-MB-436, and
MDA-MB-453; the Arizona Cancer Center (ACC): UACC-812; or the National Cancer Institute
(NCI): ZR75-1.

AIB1 gene copy number was determined by FISH. For FISH analysis, interphase nuclei
were fixed in methanol:acetic acid (3:1) and dropped onto microscope slides. AIB1 amplification
was detected in the breast cancer cell line ZR75-1, the ovarian cancer cell line BG-1, and two
uncultured breast cancer samples. Intra-chromosomal amplification of AIB1 was apparent in
metaphase chromosomes of ZR75-1 and BG-1. Numerous copies of AIB1 were resolved in the
adjacent interphase nuclei. Extrachromosomal copies (e.g., in episomes or double minute
chromosomes) of AIB1 have also been detected. The Spectrum-Orange (Vysis) labeled AIB1 P1
probe was hybridized with a biotinylated reference probe for 20q11 (RMC20P037) or a fluorescein
labeled probe for 20p (RMC20C039).

High level amplification of AIB1 (greater than 20 fold), similar to that observed in BT-474
and MCF-7, was seen in two additional ER-positive cell lines, breast carcinoma ZR75-1 and
ovarian carcinoma BG-1 (see Fig. 2). Interphase FISH studies demonstrated that amplification of
chromosome 20q in breast cancer is complex, involving several distinct variably co-amplified
chromosomal segments derived from 20q11, 20q12, and 20q13. Probes for the 20q11 and 20q13
regions of amplification did not detect amplification in ZR75-1 and BG-1, suggesting that
amplification of AIB1 (which maps to 20q12) occurred independently in these cell lines.

To determine if AIB1 amplification also occurred in uncultured cells from patient biopsies,
breast cancer specimens were screened for AIB1 amplification by interphase FISH. In two of 16
specimens analyzed, high AIB1 copy number (up to 25 copies/cell) was detected. Both tumor
specimens tested came from post-menopausal patients and were ER/PR positive. One of the
specimens was obtained from a metastatic tumor of a patient who subsequently responded favorably
to tamoxifen treatment.

AIB1 expression was also examined in cells with and without AIB1 amplification and
compared to expression of ER, SRC-1 and TIF2 by Northern blotting. In accordance with its
amplification status, AIB1 was highly overexpressed in BT-474, MCF-7, ZR75-1, and BG-1 (Fig.
2). Three of the four cell lines exhibiting AIB1 overexpression also demonstrated prominent ER
expression, while two others displayed lower but detectable ER expression (BT-474 and BT-20).
Fig. 2 also shows that the expression of TIF2 and SRC-1 remained relatively constant in all cell
lines tested. Taken together, these observations demonstrate that AIB1 amplification is associated
with significant overexpression of AIB1 gene product. The correlation of elevated AIB1 expression
with ER positivity in tumors indicates that AIB1 is a component of the estrogen signaling pathway,
the amplification of which is selected during cancer development and progression.

To determine whether expression of AIB1 increases ER ligand-dependent transactivation,
transient transfection assays were performed. The effect of increasing levels of AIB1 on
transcription of an ER-dependent reporter was measured. The results demonstrated that
co-transfection of AIB1 led to a dose-dependent increase in estrogen-dependent transcription (Fig. 3).
This effect was not observed when the estrogen antagonist, 4-hydroxytamoxifen (4-OHT), was
substituted for 17β-estradiol or when the estrogen response element (ERE) was removed from the
reporter plasmid (Fig. 3). A modest increase in basal transcription levels was observed with higher
concentrations of AIB1 even in the absence of an ERE, suggesting that AIB1 may have an intrinsic
transactivation function. These results demonstrate that, like the closely related TIF2 and SRC-1,
AIB1 functions as an ER co-activator.

Example 2: Characterization of AIB1
A. Functional Domains of AIB1

TIF-2, SRC-1, and AIB1 are characterized by highly conserved N-terminal bHLH and PAS
domains. The PAS region functions as a protein dimerization interface in the mammalian aryl
hydrocarbon receptor and the aryl hydrocarbon receptor nuclear transporter proteins, as well as the
Drosophila transcription factors sim and per. The PAS region (SEQ. I.D. NO. 2) of AIB1
functions as a protein interaction domain, mediating binding between AIB1 and other proteins.
However, steroid hormone activators lacking the PAS domain are capable of interacting with
nuclear steroid hormone receptors. The highly conserved bHLH domain (SEQ. I.D. NO. 3)
participates in protein interactions which mediate or modulate transmission of the hormone signal to
the transcriptional apparatus. The ER-interacting domain (SEQ. I.D. NO. 8) mediates binding of
AIB1 with a steroid hormone receptor protein.

AIB1 also interacts with the transcriptional integrators CREB binding protein (CBP) and
p300. These transcriptional integrators interact directly with the basal transcriptional machinery.
The CBP/p300 receptor association domain of AIB1 does not encompass the bHLH/PAS regions.
B. Purification of Gene Products 

DNA containing a sequence that encodes part or all of the amino acid sequence of AIB1 can
be subcloned into an expression vector, using a variety of methods known in the art. The
recombinant protein can then be purified using standard methods. For example, a recombinant
polypeptide can be expressed as a fusion protein in procaryotic cells such as E. coli. Using the
maltose binding protein fusion and purification system (New England Biolabs), the cloned human
cDNA sequence is inserted downstream of, and in frame with, the gene encoding maltose binding
protein (malE). The malE fusion protein is overexpressed in E. coli and can be readily purified in
quantity. In the absence of convenient restriction sites in the human cDNA sequence, PCR can be
used to introduce restriction sites compatible with the pMalE vector at the 5' and 3' ends of the
cDNA fragment to facilitate insertion of the cDNA fragment into the vector. Following expression,
the fusion protein can be purified by affinity chromatography. For example, the fusion
protein can be purified by virtue of the ability of the maltose binding protein portion of the fusion
protein to bind to amylose immobilized on a column.

To facilitate protein purification, the pMalE plasmid contains a factor Xa cleavage site
upstream of the site at which the cDNA is inserted into the vector. Thus, the fusion protein
purified as described above can be cleaved with factor Xa to separate the maltose binding protein
portion of the fusion protein from the recombinant human cDNA gene product. The cleavage products
can be subjected to further chromatography to purify recombinant polypeptide from the maltose
binding protein. Alternatively, an antibody specific for the desired recombinant gene product can
be used to purify the fusion protein and/or the gene product cleaved from the fusion protein. Many
comparable commercially available fusion protein expression systems can be utilized similarly.

AIB1 polypeptides can also be expressed in eucaryotic cells, e.g., yeast cells, either alone or
as a fusion protein. For example, a fusion protein containing the GAL4 DNA-binding domain or
activation domain fused to a functional domain of AIB1, e.g., the PAS domain, the bHLH domain,
or the ER-interacting domain, can be expressed in yeast cells using standard methods such as the
yeast two-hybrid system described below. Alternatively, AIB1 polypeptides can be expressed in
COS-1 cells using methods well known in the art, e.g., by transfecting a DNA encoding an AIB1
polypeptide into COS-1 cells using, e.g., the Lipofectamine transfection protocol described below,
and culturing the cells under conditions suitable for protein expression.

Example 3: Detection of AIB1

A. Detection of Nucleotides Encoding AIB1

Determination of gene copy number in cells of a patient-derived sample is known in the art.
For example, AIB1 amplification in cancer-derived cell lines as well as uncultured breast cancer
cells was assessed using bicolor FISH analysis as follows. A genomic P1 clone containing AIB1
was labeled with Spectrum Orange-dUTP (Vysis) using the BioPrime DNA Labeling System
(Gibco BRL). A 20q11 P1 clone was labeled with Biotin-16-dUTP (BMB) using nick translation.
Fluorescent images were captured using a Zeiss Axiophot microscope equipped with a CCD camera
and IP Lab Spectrum software (Signal Analytics). Interphase FISH analysis of uncultured breast
cancer samples was performed using known methods (Kallioniemi et al., PNAS 91: 2156-2160
(1994); Guan et al., Nature Genetics 8: 155-161 (1994); Tanner et al., Clin. Cancer Res. 1:
1455-1461 (1995); Guan et al., Cancer Res. 56: 3446-3450 (August 1996); Anzick et al., Science 277:
965-968 (August 1997)). Alternatively, standard Southern hybridization techniques can be
employed to evaluate gene amplification. For example, Southern analysis is carried out using a
non-repetitive fragment of genomic AIB1 DNA, e.g., derived from the 20q11 P1 clone described
above or another AIB1 gene-containing genomic clone, as a probe.

The level of gene expression may be measured using methods known in the art, e.g., in situ
hybridization, Northern blot analysis, or Western blot analysis using AIB1-specific monoclonal or
polyclonal antibodies. AIB1 gene transcription was measured using Northern analysis. For
example, the data shown in Fig. 2 were obtained as follows. The blot was hybridized sequentially
with a probe (ER, AIB1, TIF2, SRC-1, or β-actin as indicated to the left of the photograph). AIB1
expression was compared to that of ER, TIF2, and SRC-1. cDNA clones were obtained from
Research Genetics [TIF2 (clone 132364, GenBank accession no. R25318); SRC-1 (clone 418064,
GenBank accession no. W90426)], the American Type Culture Collection (pHEGO-hyg, ATCC
number 79995), or Clontech (β-actin). The AIB1 probe was a 2.2 kb NotI/SacI fragment of
pCMVSPORT-B11. The β-actin probe was used as a control for loading error. To avoid
cross-hybridization between these related genes and to match signal intensities, similar sized probes from
the 3'UTRs of AIB1, TIF2, and SRC-1 were utilized. Each of these probes detected a signal in
normal mammary RNA on longer exposure. Electrophoresis, transfer and hybridization of 15 μg
total RNA was performed by standard methods.

B. Detection of AIB1 Gene Products

AIB1 polypeptides to be used as antigens to raise AIB1-specific antibodies can be generated
by methods known in the art, e.g., proteolytic cleavage, de novo synthesis, or expression of a
recombinant polypeptide from the cloned AIB1 gene or a fragment thereof. AIB1-specific
antibodies are then produced using standard methodologies for raising polyclonal antisera and
making monoclonal antibody-producing hybridoma cell lines (see Coligan et al., eds., Current
Protocols in Immunology, 1992, Greene Publishing Associates and Wiley-Interscience). To
generate monoclonal antibodies, a mouse is immunized with an AIB1 polypeptide,
antibody-secreting B cells are isolated from the mouse, and the B cells are immortalized with a
non-secretory myeloma cell fusion partner. Hybridomas are then screened for production of an
AIB1-specific antibody and cloned to obtain a homogeneous cell population which produces a
monoclonal antibody.

For administration to human patients, antibodies, e.g., AIB1-specific monoclonal antibodies,
can be humanized by methods known in the art. Antibodies with a desired binding specificity can
be commercially humanized (Scotgene, Scotland; Oxford Molecular, Palo Alto, CA).

Example 4: Detection of AIB1-related cell proliferative disorders
A. Diagnostic and Prognostic Methods

The invention includes a method of detecting an aberrantly proliferating cell, e.g., a steroid
hormone-responsive cancer cell such as a breast cancer cell, an ovarian cancer cell, a colon cancer
cell, or a prostate cancer cell, by detecting the number of AIB1 gene copies in the cell and/or the
level of expression of the AIB1 gene product. AIB1 gene amplification or gene expression in a
patient-derived tissue sample is measured as described above and compared to the level of
amplification or gene expression in normal non-cancerous cells. An increase in the level of
amplification or gene expression detected in the patient-derived biopsy sample compared to the
normal control is diagnostic of a diseased state, i.e., the presence of a steroid hormone-responsive
cancer.
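
The comparison against a normal control described above can be illustrated with a short sketch. It is not part of the disclosed method: the function names and the two-fold cutoff are hypothetical, since the text states only that an increase relative to the normal control is diagnostic and gives no numerical threshold.

```python
def fold_over_control(sample_value: float, control_value: float) -> float:
    """Return the sample measurement divided by the normal-control measurement.

    The measurement may be AIB1 gene copies per cell (e.g., from interphase
    FISH) or a normalized expression signal (e.g., from a Northern blot).
    """
    if control_value <= 0:
        raise ValueError("control measurement must be positive")
    return sample_value / control_value


def is_increased(sample_value: float, control_value: float,
                 min_fold: float = 2.0) -> bool:
    """Flag a sample whose level is at least min_fold times the control.

    min_fold is a hypothetical cutoff chosen for illustration only.
    """
    return fold_over_control(sample_value, control_value) >= min_fold


# Example: 25 AIB1 copies/cell in a tumor specimen versus 2 copies/cell
# in normal diploid cells.
print(fold_over_control(25, 2))  # 12.5
print(is_increased(25, 2))       # True
```

In practice the cutoff would have to be established empirically for each assay format.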

Because of the importance of estrogen exposure to mammary carcinogenesis and of anti-estrogen
treatment in breast cancer therapy, such assays are also useful to determine the frequency
of alterations of AIB1 expression in pre-malignant breast lesions (e.g., ductal carcinoma in situ) and
during the progression from hormone-dependent to hormone-independent tumor growth.

The diagnostic methods of the invention are useful to determine the prognosis of a patient
and the estrogen-responsive status of a steroid hormone-responsive cancer.

AIB1 expression can also be measured at the protein level by detecting an AIB1 gene
product with an AIB1-specific monoclonal or polyclonal antibody preparation.

B. Diagnosis of Tamoxifen-Sensitivity

Overexpression of AIB1, e.g., as a result of AIB1 gene amplification, in steroid
hormone-responsive cancers can predict whether the cancer is treatable with anti-endocrine compositions,
e.g., tamoxifen. AIB1 amplification or overexpression in a patient-derived tissue sample compared
to a normal (non-cancerous) tissue indicates tumor progression.

Absence of AIB1, e.g., loss of all or part of the AIB1 gene, but retention of ER-positivity
in steroid hormone-responsive cancers predicts failure or poor responsiveness to anti-endocrine
therapy, e.g., administration of anti-estrogen compositions such as tamoxifen. Since loss of AIB1
expression in a cancer cell may indicate a disruption of the ER signal transduction pathway,
anti-estrogen therapy may be ineffective to treat such cancers. Patients identified in this manner (who
would otherwise be treated with anti-estrogens) would be treated with alternative therapies.

Loss of estrogen receptor in recurrent breast cancer is also associated with poor response to
endocrine therapy. Up to 30% to 40% of metastases from hormone receptor-positive primary
breast cancer do not respond to endocrine therapy. The frequency of hormone receptor status
changes between primary and recurrent tumors, and whether such a change might explain
unresponsiveness to endocrine therapy, was examined. Primary breast cancer samples and matched
asynchronous recurrences were studied from 50 patients who had not received any adjuvant
therapy. ER and progesterone receptor (PR) status was determined immunohistochemically from
histologically representative formalin-fixed paraffin-embedded tumor samples. ER status was
ascertained by mRNA in situ hybridization. Thirty-five (70%) of 50 primary tumors were positive
for ER and 30 (60%) for PR. Hormone receptor status of the recurrent tumor differed from that of
the primary tumor in 18 cases (36%). Discordant cases were due to the loss of ER (n=6), loss of
PR (n=6), or loss of both receptors (n=6). Receptor-negative primary tumors were always
accompanied by receptor-negative recurrences. Among 27 patients with ER-positive primary
tumors, loss of ER was a significant predictor (P=.0085) of poor response to subsequent endocrine
therapy. Only one of eight patients (12.5%) with lost ER expression responded to tamoxifen
therapy, whereas the response rate was 74% (14 of 19) for patients whose recurrent tumors
retained ER expression. Loss of ER expression in recurrent breast cancer predicts poor response to
endocrine therapy in primarily ER-positive patients. Evaluation of ER expression and/or AIB1
expression (or gene copy number) is useful to determine the most effective approach to treatment of
steroid-responsive cancers.
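
The response figures above can be checked with a short calculation. The sketch below assumes a two-sided Fisher exact test on the 2 x 2 table of ER status at recurrence against tamoxifen response; the text does not name the statistical test that was used, so this choice is an assumption. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    def weight(k: int) -> int:
        # Unnormalized probability (integer count) of a table with k in cell a.
        return comb(col1, k) * comb(n - col1, row1 - k)

    observed = weight(a)
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, row1)


# Rows: ER lost (1 responder, 7 non-responders),
#       ER retained (14 responders, 5 non-responders).
p = fisher_exact_two_sided(1, 7, 14, 5)
print(round(p, 4))  # 0.0085
```

Under this assumption the computed value agrees with the reported P=.0085.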

Example 5: Screening of candidate compounds
A. In vitro assays

The invention includes methods of screening to identify compounds which inhibit the
interaction of AIB1 with ER, thereby decreasing estrogen-dependent transcription which leads to
aberrant cell proliferation. A transcription assay is carried out in the presence and absence of the
candidate compound. A decrease in transcription in the presence of the compound compared to that
in its absence indicates that the compound blocks an AIB1/ER interaction and inhibits
estrogen-dependent transcription.

To determine the effect of AIB1 on estrogen-dependent transcription, an ER reporter
plasmid can be used. The transcription assays described herein were conducted as follows. COS-1
cells were grown and maintained in phenol-red free DMEM medium supplemented with 10%
charcoal-stripped fetal bovine serum. Cells were plated into 6-well culture dishes at 1.5 × 10^5
cells/well and allowed to grow overnight. Transfection of cells with the ER reporter plasmid was
performed with Lipofectamine (Gibco BRL) following the manufacturer's protocol. Three ng
pRL-CMV were used as an internal control for transfection efficiency. Ligand or ethanol vehicle
was added 24 hours post-transfection and cell lysates were harvested 48 hours post-transfection.
Reporter activities were determined using the Dual-Luciferase Reporter Assay System (Promega)
and the results expressed in relative luminescence units (RLU; luciferase/Renilla luciferase).
pRL-CMV and pGL3-promoter were obtained from Promega. pHEGO-hyg was obtained from ATCC.
The ER reporter pGL3.luc.3ERE contains three tandem copies of the ERE upstream from the SV40
promoter driving the luciferase gene. Standard mammalian expression vectors were utilized.
Empty pcDNA3 vector was added to each of the pcDNA3.1-AIB1 dilutions to maintain constant
amounts of plasmid DNA.
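
The normalization implied by the dual-reporter readout above (luciferase signal divided by the Renilla luciferase internal control) can be sketched as follows. The function names and the luminometer readings are hypothetical and serve only to illustrate the arithmetic; they are not data from the experiments described.

```python
def normalized_activity(firefly_rlu: float, renilla_rlu: float) -> float:
    """Reporter luciferase signal corrected for transfection efficiency."""
    if renilla_rlu <= 0:
        raise ValueError("Renilla control signal must be positive")
    return firefly_rlu / renilla_rlu


def fold_activation(treated: float, vehicle: float) -> float:
    """Ratio of normalized activity with ligand to that with vehicle alone."""
    if vehicle <= 0:
        raise ValueError("vehicle activity must be positive")
    return treated / vehicle


# Hypothetical readings for one well treated with ethanol vehicle and one
# well treated with ligand.
vehicle = normalized_activity(firefly_rlu=2000, renilla_rlu=1000)
ligand = normalized_activity(firefly_rlu=12000, renilla_rlu=1200)
print(vehicle, ligand)                   # 2.0 10.0
print(fold_activation(ligand, vehicle))  # 5.0
```

Dividing by the Renilla signal first means that wells with different transfection efficiencies can be compared directly.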

Compounds which inhibit the interaction of AIB1 with ER are also identified using a
standard co-precipitation assay. AIB1/ER co-precipitation assays are carried out as follows. An
AIB1 polypeptide and an ER polypeptide are incubated together to allow complex formation. One
of the polypeptides is typically a fusion protein, e.g., GST-AIB1, and the other is tagged with a
detectable label (e.g., 32P-labeled ER). After incubation, the complex is precipitated, e.g., using
glutathione-Sepharose beads. The beads are washed, filtered through a glass fiber filter, and
collected. The amount of co-precipitated 32P-label is measured. A reduction in the amount of
co-precipitated label in the presence of a candidate compound compared to that in the absence of the
candidate compound indicates that the compound inhibits an AIB1/ER interaction.

Alternatively, a standard in vitro binding assay can be used. For example, one polypeptide,
e.g., AIB1, can be bound to a solid support and contacted with the second polypeptide, e.g., ER.
The amount of the second polypeptide which is retained on the solid support is then measured. A
reduction in the amount of retained (second) polypeptide in the presence of a candidate compound
compared to that in its absence indicates that the compound inhibits an AIB1/ER interaction.
Techniques for column chromatography and coprecipitation of polypeptides are well known in the
art.

An evaluation of AIB1/ER interaction and identification of compounds that block or
reduce the interaction can also be carried out in vivo using a yeast two-hybrid expression system in
which the activity of a transcriptional activator is reconstituted when the two proteins or
polypeptides of interest closely interact or bind to one another.

The yeast GAL4 protein consists of functionally distinguishable domains. One domain is
responsible for DNA-binding and the other for transcriptional activation. In the two-hybrid
expression system, plasmids encoding two hybrid proteins, a first fusion protein containing the
GAL4 DNA-binding domain fused to a first protein, e.g., AIB1, and a second fusion protein
containing the GAL4 activation domain fused to a second protein, e.g., ER, are introduced into
yeast. If the two proteins are able to interact with one another, the ability to activate transcription
from promoters containing Gal4-binding sites upstream from an activating sequence from GAL1
(UASg) is reconstituted, leading to the expression of a reporter gene. A reduction in the expression
of the reporter gene in the presence of a candidate compound compared to that in the absence of the
compound indicates that the compound reduces an AIB1/ER interaction.

A method of identifying a DNA-binding protein which regulates AIB1 transcription can be
carried out as follows:

A DNA containing a cis-acting regulatory element can be immobilized on polymeric beads, such as
agarose or acrylamide. A mixture of proteins, such as a cell lysate, is allowed to come in contact
with and bind to the DNA. Following removal of non-binding proteins, specifically bound proteins
are eluted with a competing DNA sequence which may be identical to the immobilized sequence.
Specific binding of a protein to the DNA regulatory element indicates that the protein may regulate
AIB1 transcription. Functional activity of the identified trans-acting factor can be confirmed with
an appropriate functional assay, such as one which measures the level of transcription of a reporter
gene having the cis-acting regulatory gene 5' to the transcription start site of AIB1.

A method of identifying a compound which decreases the level of AIB1 transcription can be
accomplished by contacting an immobilized AIB1-derived cis-acting regulatory element with a
trans-acting regulatory factor in the presence and absence of a candidate compound. A detectable
change, i.e., a reduction, in specific binding of the trans-acting factor to its DNA target indicates
that the candidate compound inhibits AIB1 transcription.

In addition to interacting with ER, AIB1 also interacts with the transcriptional integrators
CBP and p300. CBP and p300 participate in the basal transcriptional apparatus in a cell. Thus,
another approach to inhibit signal transduction through AIB1 is to prevent the formation of or
disrupt an interaction of AIB1 with CBP and/or p300. Compounds which inhibit signal
transduction (and therefore cell proliferation) can be identified by contacting AIB1 (or a fragment
thereof which interacts with CBP or p300) with CBP or p300 (or a fragment thereof containing an
AIB1-interacting domain, e.g., a C-terminal fragment) in the presence and absence of a candidate
compound. For example, a C-terminal fragment of CBP involved in steroid receptor co-activator
interaction contains 105 amino acids in the Q-rich region of CBP (Kamei et al., 1996, Cell
85:403-414; Yao et al., 1996, Proc. Natl. Acad. Sci. USA 93:10626-10631; Hanstein et al., 1996, Proc.
Natl. Acad. Sci. USA 93:11540-11545). A decrease in AIB1 interaction with CBP or p300 in the
presence of a candidate compound compared to that in its absence indicates that the compound inhibits
AIB1 interaction with these transcriptional integrators and, as a result, AIB1-mediated signal
transduction leading to DNA transcription and cell proliferation. Compounds which inhibit AIB1
interaction with transcriptional integrators can also be identified using a co-precipitation assay and
the yeast two-hybrid expression system described above.

B. In vivo assays

Transgenic mice are made by standard methods, e.g., as described in Leder et al., U.S.
Patent No. 4,736,866, herein incorporated by reference, or Hogan et al., 1986, "Manipulating the
Mouse Embryo", Cold Spring Harbor Laboratory, New York.

Briefly, a vector containing a promoter operably linked to AIB1-encoding cDNA is injected
into murine zygotes, e.g., C57BL/6J X DBA/2F2 zygotes. Incorporation of the transgene into
murine genomic DNA is monitored using methods well known in the art of molecular biology,
e.g., dot blotting tail DNA with a probe complementary to the 3' region of the gene contained in
the AIB1 transgene construct. Mice thus confirmed to harbor the transgene can then be used as
founders. Animal lines are created by crossing founders with C57BL/6J mice (The Jackson
Laboratory, Bar Harbor, ME). AIB1 transgenic mice can be used to screen candidate compounds
in vivo to identify compounds which inhibit aberrant cell proliferation, e.g., as measured by
reduction in tumor growth or metastasis. AIB1 transgenic mice are also useful to identify other genes
involved in steroid hormone receptor-dependent cancers and to establish mouse cell lines which
overexpress AIB1. AIB1-overexpressing cell lines are useful to screen for compounds that
interfere with AIB1 function, e.g., by blocking the interaction of AIB1 with a ligand.

Example 6: AIB1 therapy

As discussed above, AIB1 is a novel member of the SRC-1 family of transcriptional
co-activators. Amplification and overexpression of AIB1 in ER-positive breast and ovarian cancer
cells and in breast cancer biopsies implicate this protein as a critical component of the estrogen
response pathway. AIB1 overexpression results in increased ER-dependent transcriptional activity,
which confers a growth advantage on AIB1 amplification-bearing clones during the development
and progression of estrogen-dependent cancers.

Compounds which inhibit or disrupt the interaction of an AIB1 gene product with a steroid
hormone receptor, e.g., ER, are useful as anti-neoplastic agents for the treatment of patients
suffering from steroid hormone-responsive cancers such as breast cancer, ovarian cancer, prostate
cancer, and colon cancer. Likewise, compounds which disrupt interaction between AIB1 and p300
and/or CBP are also useful as anti-neoplastic agents.

AIB1 polypeptides or peptide mimetics of such polypeptides, e.g., those containing domains
which interact with steroid hormone receptors, can be administered to patients to block the
interaction of endogenous intracellular AIB1 and a steroid hormone receptor, e.g., ER, in an
aberrantly proliferating cell. A mimetic may be made by introducing conservative amino acid
substitutions into the peptide. Certain amino acid substitutions are conservative since the old and
the new amino acid share a similar hydrophobicity or hydrophilicity or are similarly acidic, basic
or neutrally charged (Stryer, "Biochemistry", 1975, Ch. 2, Freeman and Company, New York).
Conservative substitutions replace one amino acid with another amino acid that is similar in size,
hydrophobicity, etc. Examples of conservative substitutions are shown in the table below (Table
1).

TABLE 1

Original Residue    Conservative Substitutions
Ala                 ser
Arg                 lys
Asn                 gln; his
Asp                 glu
Cys                 ser
Gln                 asn
Glu                 asp
Gly                 pro
His                 asn; gln
Ile                 leu; val
Leu                 ile; val
Lys                 arg; gln; glu
Met                 leu; ile
Phe                 met; leu; tyr
Ser                 thr
Thr                 ser
Trp                 tyr
Tyr                 trp; phe
Val                 ile; leu

Variations in the cDNA sequence that result in amino acid changes, whether conservative or 
not, should be minimized in order to preserve the functional and immunologic identity of the 
encoded protein. 
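
As an illustration of how Table 1 might be applied when designing a mimetic, the sketch below encodes the table as a lookup and checks whether a proposed substitution is listed. The names CONSERVATIVE and is_conservative are hypothetical, and the lookup is directional because the table is: Ala to Ser is listed, while Ser to Ala is not.

```python
# Table 1 as a mapping from original residue to its listed substitutions.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if Table 1 lists replacement for the original residue.

    Residues are three-letter codes; case is normalized before lookup.
    """
    original = original.capitalize()
    replacement = replacement.capitalize()
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ile", "Val"))  # True
print(is_conservative("Gly", "Trp"))  # False
print(is_conservative("Ser", "Ala"))  # False (table is directional)
```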



Compositions administered therapeutically include polypeptide mimetics in which one or
more peptide bonds have been replaced with an alternative type of covalent bond which is not
susceptible to cleavage by peptidases. Where proteolytic degradation of the peptides following
injection into the subject is a problem, replacement of a particularly sensitive peptide bond with a
noncleavable peptide mimetic yields a more stable and thus more useful therapeutic polypeptide.
Such mimetics, and methods of incorporating them into polypeptides, are well known in the art.
Similarly, the replacement of an L-amino acid residue with a D-amino acid residue is a standard
way of rendering the polypeptide less sensitive to proteolysis. Also useful are amino-terminal
blocking groups such as t-butyloxycarbonyl, acetyl, theyl, succinyl, methoxysuccinyl, suberyl,
adipyl, azelayl, dansyl, benzyloxycarbonyl, fluorenylmethoxycarbonyl, methoxyazelayl,
methoxyadipyl, methoxysuberyl, and 2,4-dinitrophenyl.
AIB1 polypeptides or related peptide mimetics may be administered to a patient
intravenously in a pharmaceutically acceptable carrier such as physiological saline. Standard
methods for intracellular delivery of peptides can be used, e.g., packaging in liposomes. Such
methods are well known to those of ordinary skill in the art. It is expected that an intravenous
dosage of approximately 1 to 100 μmoles of the polypeptide of the invention would be administered
per kg of body weight per day. The compositions of the invention are useful for parenteral
administration, such as intravenous, subcutaneous, intramuscular, and intraperitoneal.

The therapeutic compositions of this invention may also be administered by the use of
surgical implants which release the compounds of the invention. These devices could be readily
implanted into the target tissue, e.g., a solid tumor mass, and could be mechanical or passive.
Mechanical devices, such as pumps, are well known in the art, as are passive devices (e.g.,
consisting of a polymer matrix which contains therapeutic formulations; these polymers may slowly
dissolve or degrade to release the compound, or may be porous and allow release via pores).

Antisense therapy, in which a DNA sequence complementary to an AIB1 mRNA transcript is
either produced in the cell or administered to the cell, can be used to decrease AIB1 gene expression,
thereby inhibiting undesired cell proliferation, e.g., proliferation of steroid hormone-responsive
cancer cells. An antisense polynucleotide, i.e., one which is complementary to the coding
sequence of the AIB1 gene, is introduced into the cells in which the gene is overproduced. The
antisense strand (either RNA or DNA) may be directly introduced into the cells in a form that is
capable of binding to the transcripts. Alternatively, a vector containing a DNA sequence which,
once within the target cells, is transcribed into the appropriate antisense mRNA, may be
administered. An antisense nucleic acid which hybridizes to the coding strand of AIB1 DNA can
decrease or inhibit production of an AIB1 gene product by associating with the normally
single-stranded mRNA transcript, and thereby interfering with translation.
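
The relationship between a coding sequence and its antisense strand can be sketched as a reverse-complement operation. The function name and the six-base input are hypothetical and chosen only for illustration; they are not taken from the AIB1 sequence.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense: str) -> str:
    """Return the antisense DNA strand (5' to 3') for a sense DNA sequence.

    The antisense strand is the reverse complement of the coding strand,
    so it can base-pair with an mRNA transcribed from that coding strand.
    """
    sense = sense.upper()
    invalid = set(sense) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical coding-strand fragment, written 5' to 3'.
print(antisense_dna("ATGGCC"))  # GGCCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check.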

DNA is introduced into target cells of the patient with or without a vector or using standard
vectors and/or gene delivery systems. Suitable gene delivery systems may include liposomes,
receptor-mediated delivery systems, naked DNA, and viral vectors such as herpes viruses,
retroviruses, and adenoviruses, among others. The DNA of the invention may be administered in a
pharmaceutically acceptable carrier. Pharmaceutically acceptable carriers are biologically
compatible vehicles which are suitable for administration to an animal, e.g., physiological saline.
A therapeutically effective amount is an amount of the nucleic acid of the invention which is
capable of producing a medically desirable result in a patient. As is well known in the medical
arts, dosage for any given patient depends upon many factors, including the patient's size, body
surface area, age, the particular compound to be administered, sex, time and route of
administration, general health, and other drugs being administered concurrently. Dosages will
vary, but a preferred dosage for intravenous administration of a nucleic acid is from approximately
10* to 10^ copies of the nucleic acid molecule.
Determination of optimal dosage is well within the abilities of a pharmacologist of ordinary
skill.

Example 7: AIB1 Knock out and Overexpression Mouse Mutants

Mutant organisms that underexpress or overexpress AIB1 are useful for research. Such
mutants allow insight into the physiological and/or pathological role of AIB1 in a healthy and/or
pathological organism. These mutants are said to be "genetically engineered," meaning that
information in the form of nucleotides has been transferred into the mutant's genome at a location,
or in a combination, in which it would not normally exist. Nucleotides transferred in this way are
said to be "non-native." For example, a WAP promoter inserted upstream of a native AIB1 gene
would be non-native. An extra copy of a mouse AIB1 gene present on a plasmid and transformed
into a mouse cell would be non-native. Mutants may be, for example, produced from mammals,
such as mice, that either overexpress AIB1 or underexpress AIB1 or that do not express AIB1 at
all. Overexpression mutants are made by increasing the number of AIB1 genes in the organism, or
by introducing an AIB1 gene into the organism under the control of a constitutive or inducible or
viral promoter such as the mouse mammary tumor virus (MMTV) promoter or the whey acidic
protein (WAP) promoter or the metallothionein promoter. Mutants that underexpress AIB1 may be
made by using an inducible or repressible promoter, or by deleting the AIB1 gene, or by destroying
or limiting the function of the AIB1 gene, for instance by disrupting the gene by transposon
insertion.

Anti-sense genes may be engineered into the organism, under a constitutive or inducible
promoter, to decrease or prevent AIB1 expression. A gene is said to be "functionally deleted"
when genetic engineering has been used to negate or reduce gene expression to negligible levels.
When a mutant is referred to in this application as having the AIB1 gene altered or functionally
deleted, this reference refers to the AIB1 gene and to any ortholog of this gene; for instance, "a
transgenic animal wherein at least one AIB1 gene has been functionally deleted" would encompass
the mouse ortholog of the AIB1 gene, pCIP. When a mutant is referred to as having "more than
the normal copy number" of a gene, this means that it has more than the usual number of genes
found in the wild-type organism, e.g., in the diploid mouse or human.

A mutant mouse overexpressing AIB1 may be made by constructing a plasmid having the
AIB1 gene driven by a promoter, such as the mouse mammary tumor virus (MMTV) promoter or
the whey acidic protein (WAP) promoter. This plasmid may be introduced into mouse oocytes by
microinjection. The oocytes are implanted into pseudopregnant females, and the litters are assayed
for insertion of the transgene. Multiple strains containing the transgene are then available for
study.

WAP is quite specific for mammary gland expression during lactation, and MMTV is
expressed in a variety of tissues including mammary gland, salivary gland and lymphoid tissues.
Many other promoters might be used to achieve various patterns of expression, e.g., the
metallothionein promoter.

An inducible system may be created in which AIB1 is driven by a promoter regulated by an
agent which can be fed to the mouse, such as tetracycline. Such techniques are well known in the
art.

A mutant knockout mouse from which the AIB1 (also called pCIP) gene is deleted was made
by removing coding regions of the AIB1 gene from mouse embryonic stem cells. Fig. 5 shows the
intron/exon structure for pCIP. Using this table, mutations can be targeted to coding sequences,
avoiding silent mutations caused by deletion of non-coding sequences. (Fig. 6 shows the
intron/exon structure for the human AIB1 gene.) These cells were microinjected into mouse
embryos leading to the deletion of the mouse AIB1 gene in the germ line of a transgenic mouse.
The mouse AIB1 gene was mapped and isolated by the following method: The primers AIB/mEST F1
(5'-TCCTTTTCCCAGCAGCAGTTTG-3'; SEQ. I.D. 10) and AIB1/mEST R1
(5'-ATGCCAGACATGGGCATGGG-3'; SEQ. I.D. 11) were used to screen a mouse Bacterial
Artificial Chromosome (BAC) library and to isolate a mouse BAC (designated 195H10). This
BAC was assigned to mouse chromosome 2 by fluorescence in situ hybridization (FISH). This
region is the mouse equivalent of the portion of human chromosome 20 which carries AIB1.
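
The two screening primers can be checked against the human cDNA in the sequence listing. The following is only an illustrative sketch: the helper names (reverse_complement, gc_fraction, locate_primers) are hypothetical, and the excerpt is three consecutive lines transcribed from near the end of the coding region of SEQ ID NO: 1 as printed in this document, so it may carry transcription errors.

```python
# Sketch: locate the two screening primers in an excerpt of SEQ ID NO: 1.
# All function names are illustrative assumptions, not part of the application.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


FORWARD_PRIMER = "TCCTTTTCCCAGCAGCAGTTTG"  # SEQ. I.D. 10
REVERSE_PRIMER = "ATGCCAGACATGGGCATGGG"    # SEQ. I.D. 11

# Excerpt of SEQ ID NO: 1 as transcribed from the listing (spaces removed below).
EXCERPT = (
    "AAG GGC TGG CCA TCA GGA AAT TTG GCC AGG AAC AGC TCC TTT TCC CAG CAG CAG TTT "
    "GCC CAC CAG GGG AAT CCT GCA GTG TAT AGT ATG GTG CAC ATG AAT GGC AGC AGT GGT "
    "CAC ATG GGA CAG ATG AAC ATG AAC CCC ATG CCC ATG TCT GGC ATG CCT ATG GGT CCT"
).replace(" ", "")


def locate_primers(template: str, forward: str, reverse: str):
    """Return (start, end, length) of the region bounded by the primer pair.

    The forward primer is searched as written; the reverse primer anneals to
    the opposite strand, so its reverse complement is searched on the template.
    Returns None if either primer site is absent.
    """
    fwd_start = template.find(forward)
    rev_start = template.find(reverse_complement(reverse))
    if fwd_start < 0 or rev_start < 0:
        return None
    rev_end = rev_start + len(reverse)
    return fwd_start, rev_end, rev_end - fwd_start


if __name__ == "__main__":
    print("forward GC:", gc_fraction(FORWARD_PRIMER))
    print("reverse GC:", gc_fraction(REVERSE_PRIMER))
    print("primer-bounded region:", locate_primers(EXCERPT, FORWARD_PRIMER, REVERSE_PRIMER))
```

With the excerpt as transcribed here, both primer sites are found and the region they bound is 122 nucleotides long. An error in the transcribed listing would change or remove the match, so the result is a consistency check on the text and not a statement about the mouse sequence.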

To map the structure of the gene, first the structure of the human AIB1 gene was determined
by polymerase chain reaction of a human genomic DNA clone containing AIB1 using standard
methods (Genomics 1995 Jan 20; 25(2): 501-506) and then the sequences of the intron/exon
boundaries were determined (Fig. 4). Based on this information, the corresponding regions of the
mouse BAC were sequenced. The structure of the mouse gene corresponds closely to that of the
human gene (Fig. 4). This information localizes the coding regions of the mouse AIB1 gene so that
a targeting vector can be constructed to remove these regions from mouse embryonic stem cells.
These cells can then be injected into mouse embryos leading to deletion of the mouse AIB1 gene in
the germ line of a transgenic mouse. The methods of creating deletion mutations by using a
targeting vector have been described in Cell (Thomas and Capecchi, Cell 51(3):503-512, 1987).
References and patents referred to herein are incorporated by reference.

The above examples are provided by way of illustration only and are in no way intended to
limit the scope of the invention. One of skill in the art will see that the invention may be modified
in various ways without departing from the spirit or principle of the invention. We claim all such
modifications.

Sequence Listing

(1) GENERAL INFORMATION

(i) APPLICANT: Meltzer and Trent

(ii) TITLE OF INVENTION: AIB1, A NOVEL RECEPTOR CO-ACTIVATOR AMPLIFIED IN CANCER

(iii) NUMBER OF SEQUENCES: 12

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Klarquist Sparkman Campbell Leigh & Whinston, LLP
(B) STREET: One World Trade Center
121 S.W. Salmon Street, Suite 1600
(C) CITY: Portland
(D) STATE: Oregon
(E) COUNTRY: United States of America
(F) ZIP: 97204-2988

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Disk, 3-1/2 inch
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: Windows NT
(D) SOFTWARE: WordPerfect 7.0 & ASCII

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: William D. Noonan, M.D.
(B) REGISTRATION NUMBER: 30,878
(C) REFERENCE/DOCKET NUMBER: 4239-49944

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (503) 226-7391
(B) TELEFAX: (503) 228-9446

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6837 nucleotides; 1419 amino acid residues
(B) TYPE: Human DNA & Amino Acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



CG GCG GCG GCT GCG GCT TAG TCG GTG GCG GCC GGC GGC GGC TGC GGG CTG AGC GGC 
1 5 10 15

GAG TTT CCG ATT TAA AGC TGA GCT GCG AGG AAA ATG GCG GCG GGA GGA TCA AAA TAG
20 25 30 35
TTG CTG GAT GGT
40 

TGG TTA GCC AGT 

5 60 

CCA CTG GCC AGT 
Pro Leu Ala Ser 
80 

CTT ACC TGC AGT 
10 Leu Thr Cys Ser 

TTG GCT GAG CTG 
Leu Ala Glu Leu 
115 

15 GAT AAA TGT GCG 
Asp Lys Cys Ala 
135 

GGA AAA ACT ATT 
Gly Lys Thr lie 
20 155 

CAG GGA GTT ATT 
Gin Gly Val lie 
175 

TTC CTA TTT GTG 
25 Phe Leu Phe Val 

CAA TAG CTG CAA 
Gin Tyr Leu Gin 
210 

30 CAT GAA GAA GAC 
His Glu Glu Asp 
230 

GTT TCC TGG ACA 
Val Ser Trp Thr 
35 250 

TTG ATG AAA ACA 
Leu Met Lys Thr 
270 

CAG AGA TAT GAA 
40 Gin Arg Tyr Glu 

GAA GGG GAA GAT 
Glu Gly Glu Asp 
305 

45 GAA AGA ACA TTT 

Glu Arg Thr Phe 
325 

AAG GTT GTC AAT 
Lys Val Val Asn 
50 345 

GAT ATA ATC CGA 
Asp lie lie Arg 
365 

TCC CAG AAA CGT 
55 Ser Gin Lys Arg 

TAT CGA TTC TCG 
Tyr Arg Phe Ser 
400 

60 

TTC CGA AAT CCT 
Phe Arg Asn Pro 
420 

AGA GAA CAG AAT 
65 Arg Glu Gin Asn 
440 

CCT ATG GCT GGA 
Pro Met Ala Gly 
460 

70 TTA CAG ATG CCG 
Leu Gin Met Pro 



GGA CTC AGA GAC CAA 
45 

TGC TGA TGT ATA TTC 
65 

GAT TCA CGA AAA CGC 
Asp Ser Arg Lys Arg 
85 

GGT GAA AAA CGG AGA 
Gly Glu Lys Arg Arg 
100 

ATA TCT GCC AAT CTT 
lie Ser Ala Asn Leu 
120 

ATT TTA AAG GAA ACA 
lie Leu Lys Glu Thr 
140 

TCC AAT GAT GAT GAT 
Ser Asn Asp Asp Asp 
160 

GAT AAA GAC TCC TTA 
Asp Lys Asp Ser Leu 
180 

GTG AAT CGA GAC GGA 
Val Asn Arg Asp Gly 
195 

TAT AAG CAA GAG GAC 
Tyr Lys Gin Glu Asp 
215 

AGA AAG GAT TTT CTT 
Arg Lys Asp Phe Leu 
235 

AAT GAG ACC CAA AGA 
Asn Glu Thr Gin Arg 
255 

CCA CAT GAT ATT CTG 
Pro His Asp lie Leu 
275 

ACA ATG CAG TGC TTT 
Thr Met Gin Cys Phe 
290 

TTG CAA TCT TGT ATG 
Leu Gin Ser Cys Met 
310 

CCA TCA AAC CCT GAG 
Pro Ser Asn Pro Glu 
330 

ATA GAT ACA AAT TCA 
lie Asp Thr Asn Ser 
350 

AGG TGT ATT CAG AGA 
Arg Cys lie Gin Arg 
. 370 

CAC TAT CAA GAA GCT 
His Tyr Gin Glu Ala 
385 

TTG GCT GAT GGA ACT 
Leu Ala Asp Gly Thr 
405 

GTA ACA AAT GAT CGA 
Val Thr Asn Asp Arg 
425 

GGA . TAT AGA CCA AAC 
Gly Tyr Arg Pro Asn 
445 

TGC AAC AGT TCG GTA 
Cys Asn Ser Ser Val 
465 

AGC AGC AGG GCC TAT 
Ser Ser Arg Ala Tyr 



TAA AAA TAA ACT GCT 
50 

AAG ATG AGT GGA TTA 
Met Ser Gly Leu 
70 

AAA TTG CCA TGT GAT 
Lys Leu Pro Cys Asp 
90 

CGG GAG CAG GAA AGT 
Arg Glu Gin Glu Ser 
105 

AGT GAT ATT GAC AAT 
Ser Asp lie Asp Asn 
125 

GTA AGA CAG ATA CGT 
Val Arg Gin lie Arg 
145 

GTT CAA AAA GCC GAT 
Val Gin Lys Ala Asp 
165 

GGA CCG CTT TTA CTT 
Gly Pro Leu Leu Leu 
185 

AAC ATT GTA TTT GTA 
Asn lie Val Phe Val 
200 

CTG GTT AAC ACA AGT 
Leu Val Asn Thr Ser 
220 

AAG AAT TTA CCA AAA 
Lys Asn Leu Pro Lys 
240 

CAA AAA AGC CAT ACA 
Gin Lys Ser His Thr 
260 

GAA GAC ATA AAC GCC 
Glu Asp lie Asn Ala 
280 

GCC CTG TCT CAG CCA 
Ala Leu Ser Gin Pro 
295 

ATC TGT GTG GCA CGC 
He Cys Val Ala Arg 
315 

AGC TTT ATT ACC AGA 
Ser Phe He Thr Arg 
335 

CTG AGA TCC TCC ATG 
Leu Arg Ser Ser Met 
355 

TTT TTT AGT CTA AAT 
Phe Phe Ser Leu Asn 
375 

TAT CTT AAT GGC CAT 
Tyr Leu Asn Gly His 
390 

ATA GTG ACT GCA CAG 
He Val Thr Ala Gin 
410 

CAT GGC TTT GTC TCA 
His Gly Phe Val Ser 
430 

CCA AAT CCT GTT GGA 
Pro Asn Pro Val Gly 
450 

GGC GGC ATG AGT ATG 
Gly Gly Met Ser Met 
470 

GGC TTG GCA GAC CCT 
Gly Leu Ala Asp Pro 



TGA 


ACA 


TCC 


TTT 


GAC 






55 






GGA 


GAA 


AAC 


TTG 


GAT 


Gly 


Glu 


Asn 


Leu 


Asp 








75 




ACT 


CCA 


GGA 


CAA 


GGT 


Thr 


Pro 


Gly 


Gin 


Gly 










95 


AAA 


TAT 


ATT 


GAA 


GAA 


Lys 


Tyr 


He 


Glu 


Glu 


110 










TTC 


AAT 


GTC 


AAA 


CCA 


Phe 


Asn 


Val 


Lys 


Pro 




130 








CAA 


ATA 


AAA 


GAG 


CAA 


Gin 


He 


Lys 


Glu 


Gin 






150 






GTA 


TCT 


TCT 


ACA 


GGG 


Val 


Ser 


Ser 


Thr 


Gly 








170 




CAG 


GCA 


TTG 


GAT 


GGT 


Gin 


Ala 


Leu 


Asp 


Gly 










190 


TCA 


GAA 


AAT 


GTC 


ACA 


Ser 


Glu 


Asn 


Val 


Thr 


205 










GTT 


TAC 


AAT 


ATC 


TTA 


Val 


Tyr 


Asn 


He 


Leu 




225 








TCT 


ACA 


GTT 


AAT 


GGA 


Ser 


Thr 


Val 


Asn 


Gly 






245 






TTT 


AAT 


TGC 


CGT 


ATG 


Phe 


Asn 


Cys 


Arg 


Met 








265 




AGT 


CCT 


GAA 


ATG 


CGC 


Ser 


Pro 


Glu 


Met 


Arg 










285 


CGA 


GCT 


ATG 


ATG 


GAG 


Arg 


Ala 


Met 


Met 


Glu 


300 










CGC 


ATT 


ACT 


ACA 


GGA 


Arg 


He 


Thr 


Thr 


Gly 




320 








CAT 


GAT 


CTT 


TCA 


GGA 


His 


Asp 


Leu 


Ser 


Gly 






340 






AGG 


CCT 


GGC 


TTT 


GAA 


Arg 


Pro 


Gly 


Phe 


Glu 








360 




GAT 


GGG 


CAG 


TCA 


TGG 


Asp 


Gly 


Gin 


Ser 


Trp 










380 


GCA 


GAA 


ACC 


CCA 


GTA 


Ala 


Glu 


Thr 


Pro 


Val 


395 










ACA 


AAA 


AGC 


AAA 


CTC 


Thr 


Lys 


Ser 


Lys 


Leu 




415 








ACC 


CAC 


TTC 


CTT 


CAG 


Thr 


His 


Phe 


Leu 


Gin 






435 






CAA 


GGG 


ATT 


AGA 


CCA 


Gin 


Gly 


He 


Arg 


Pro 








455 




TCG 


CCA 


AAC 


CAA 


GGC 


Ser 


Pro 


Asn 


Gin 


Gly 










475 


AGC 


ACC 


ACA 


GGG 


CAG 


Ser 


Thr 


Thr 


Gly 


Gln
ATG


AGT 


GGA 




Met 


Ser 


Gly 




495 






c 

5 


GGC 


ATG 


CAA 




Gly 


Met 


Gin 






515 






ccc 


CCA 


CAT 




Pro 


Pro 


His 


1 n 
lU 






535 




CGT 


AAT 


CGT 




Arg 


Asn 


Arg 




CAC 


TCT 


CCC 


15 


His 


Ser 


Pro 




CTC 


AGT 


GCC 




Leu 


Ser 


Ala 




590 






20 


TCA 


TCA 


CCA 




Ser 


Ser 


Pro 






610 






AAA 


GTA 


AGC 




Lys 


Val 


Ser 


25 






630 




GTG 


GAG 


AGT 




Val 


Glu 


Ser 




AAG 


GAG 


AGC 


3U 


Lys 


Glu 


Ser 




CAT 


AAA 


AAA 




His 


Lys 


Lys 




685 






35 


TTG 


ACC 


AAC 




Leu 


Thr 


Asn 






705 






CCC 


TCT 


GGA 




Pro 


Ser 


Gly 


40 






725 




GGG 


TCA 


CTG 




Gly 


Ser 


Leu 




TCA 


CCA 


GCT 


45 


Ser 


Pro 


Ala 




ATA 


ACT 


TCT 




lie 


Thr 


Ser 




780 






50 


AAG 


GAG 


AAT 




Lys 


Glu 


Asn 






800 






CTC 


TCT 


AAA 


55 


Leu 


Ser 


Lys 








820 




ACC 


AGC 


TCC 




Thr 


Ser 


Ser 


60 


ACA 


AGT 


GAA 




Thr 


Ser 


Glu 




ACT 


AGT 


TCT 




Thr 


Ser 


Ser 


65 


875 








AAG 


CAA 


CAG 




Lys 


Gin 


Gin 






895 






CAG 


TCT 


ATT 


70 


Gin 


Ser 


He 








915 





480 






GCT 


AGG 


TAT 


GGG 


Ala 


Arg 


Tyr 


Gly 






500 




TCA 


CCA 


TCT 


TCC 


Ser 


Pro 


Ser 


Ser 








520 


GGG 


AGT 


CCT 


GGT 


Gly 


Ser 


Pro 


Gly 


GGG 


AGT 


CCA 


AAG 


Gly 


Ser 


Pro 


Lys 


555 








ATG 


GCA 


TCT 


TCT 


Met 


Ala 


Ser 


Ser 




575 






CTG 


CAA 


GCC 


ATC 


Leu 


Gin 


Ala 


He 






595 




GGC 


CCC 


AAA 


TTG 


Gly 


Pro 


Lys 


Leu 








615 


AAT 


CAG 


GAT 


TCC 


Asn 


Gin 


Asp 


Ser 


TCA 


ATG 


TGT 


CAG 


Ser 


Met 


Cys 


Gin 


650 








AGT 


GTT 


GAG 


GGG 


Ser 


Val 


Glu 


Gly 




67 0 






TTA 


CTG 


CAG 


TTA 


Leu 


Leu 


Gin 


Leu 






690 




TCC 


CCC 


CTA 


GAT 


Ser 


Pro 


Leu 


Asp 








710 


GTC 


TCC 


TCC 


TCT 


Val 


Ser 


Ser 


Ser 


TTA 


CAA 


GAG 


AAG 


Leu 


Gin 


Glu 


Lys 


745 








GAG 


GTA 


GCC 


AAG 


Glu 


Val 


Ala 


Lys 




765 






TGT 


GGG 


GAC 


GGA 


Cys 


Gly 


Asp 


Gly 






785 




AAT 


GCA 


CTT 


CTT 


Asn 


Ala 


Leu 


Leu 








oOo 


GAA 


CTA 


CAG 


CCC 


Glu 


Leu 


Gin 


Pro 


ACC 


ATT 


CCT 


AGC 


Thr 


He 


Pro 


Ser 


840 








GAG 


GGA 


TCT 


GGA 


Glu 


Gly 


Ser 


Gly 




0 DU 






GAC 


TTT 


TAC 


AAT 


Asp 


Phe 


Tyr 


Asn 






880 




GTG 


TTT 


CAA 


GGA 


Val 


Phe 


Gin 


Gly 








900 


CGT 


CCT 


CCA 


TAT 


Arg 


Pro 


Pro 


Tyr 







485 




GGT 


TCC 


AGT 


AAC 


Gly 


Ser 


Ser 


Asn 








505 


TAC 


CAG 


AAC 


AAC 


Tyr 


Gin 


Asn 


Asn 


CTT 


GCC 


CCA 


AAC 


Leu 


Ala 


Pro 


Asn 


O40 








ATA 


GCC 


TCA 


CAT 


He 


Ala 


Ser 


His 




560 






GGC 


AAT 


ACT 


GGG 


Gly 


Asn 


Thr 


Gly 






580 




AGT 


GAA 


GGT 


GTG 


Ser 


Glu 


Gly 


Val 








600 


GAT 


AAC 


TCT 


CCC 


Asp 


Asn 


Ser 


Pro 


AAG 


AGT 


CCT 


CTG 


Lys 


Ser 


Pro 


Leu 


635 








TCA 


AAT 


AGC 


AGA 


Ser 


Asn 


Ser 


Arg 




655 






GCA 


GAG 


AAT 


CAA 


Ala 


Glu 


Asn 


Gin 






675 




CTT 


ACC 


TGT 


TCT 


Leu 


Thr 


Cys 


Ser 








695 


TCA 


AGT 


TGT 


AAA 


Ser 


Ser 


Cys 


Lys 


ACA 


TCT 


GGA 


GGA 


Thr 


Ser 


Gly 


Gly 


730 








CAC 


CGG 


ATT 


TTG 


His 


Arg 


He 


Leu 




7 50 






ATT 


ACT 


GCA 


GAA 


He 


Thr 


Ala 


Glu 






770 




AAT 


GTT 


GTC 


AAG 


Asn 


Val 


Val 


Lys 








7 90 


AGA 


TAC 


CTG 


CTG 


Arg 


Tyr 


Leu 


Leu 


CAA 


GTG 


GAA 


GGA 


Gin 


Val 


Glu 


Gly 


825 








TCA 


AGT 


CAA 


GAG 


Ser 


Ser 


Gin 


Glu 




845 






GAC 


TTG 


GAT 


AAT 


Asp 


Leu 


Asp 


Asn 






o oo 




AAT 


TCC 


ATA 


TCC 


Asn 


Ser 


He 


Ser 








885 


ACT 


AAT 


TCT 


CTG 


Thr 


Asn 


Ser 


Leu 


AAC 


CGA 


GCA 


GTG 


Asn 


Arg 


Ala 


Val 



920 



490 



ATA 


GCT 


TCA 


TTG 


He 


Ala 


Ser 


Leu 


AAC 


TAT 


GGG 


CTC 


Asn 


Tyr 


Gly 


Leu 


525 








CAG 


CAG 


AAT 


ATC 


Gin 


Gin 


Asn 


He 




545 






CAG 


TTT 


TCT 


CCT 


Gin 


Phe 


Ser 


Pro 






565 




AAC 


CAC 


AGC 


TTT 


Asn 


His 


Ser 


Phe 








585 


GGG 


ACT 


TCC 


CTT 


Gly 


Thr 


Ser 


Leu 


AAT 


ATG 


AAT 


ATT 


Asn 


Met 


Asn 


He 


620 








GGC 


TTT 


TAT 


TGC 


Gly 


Phe 


Tyr 


Cys 




640 






GAT 


CAC 


CTC 


AGT 


Asp 


His 


Leu 


Ser 






660 




AGG 


GGT 


CCT 


TTG 


Arg 


Gly 


Pro 


Leu 








680 


TCT 


GAT 


GAC 


CGG 


Ser 


Asp 


Asp 


Arg 


GAA 


TCT 


TCT 


GTT 


Glu 


Ser 


Ser 


Val 


715 








GTA 


TCC 


TCT 


ACA 


Val 


Ser 


Ser 


Thr 




735 






CAC 


AAG 


TTG 


CTG 


His 


Lys 


Leu 


Leu 






755 




GCC 


ACT 


GGG 


AAA 


Ala 


Thr 


Gly 


Lys 








775 


CAG 


GAG 


CAG 


CTA 


Gin 


Glu 


Gin 


Leu 


GAC 


AGG 


GAT 


GAT 


Asp 


Arg 


Asp 


Asp 


810 








GTG 


GAT 


AAT 


AAA 


Val 


Asp 


Asn 


Lys 




830 






AAA 


GAC 


CCT 


AAA 


Lys 


Asp 


Pro 


Lys 






850 




CTA 


GAT 


GCT 


ATT 


Leu 


Asp 


Ala 


He 








870 


TCA 


AAT 


GGT 


AGT 


Ser 


Asn 


Gly 


Ser 


GGT 


TTG 


AAA 


AGT 


Gly 


Leu 


Lys 


Ser 


905 








TCT 


CTG 


GAT 


AGC 


Ser 


Leu 


Asp 


Ser 




925 







ACC 


CCT 


GGG 


CCA 


Thr 


Pro 


Gly 


Pro 


510 








AAC 


ATG 


AGT 


AGC 


Asn 


Met 


Ser 


Ser 




530 






ATG 


ATT 


TCT 


CCT 


Met 


He 


Ser 


Pro 






550 




GTT 


GCA 


GGT 


GTG 


Val 


Ala 


Gly 


Val 








570 


TCC 


AGC 


AGC 


TCT 


Ser 


Ser 


Ser 


Ser 


TTA 


TCT 


ACT 


CTG 


Leu 


Ser 


Thr 


Leu 


605 








ACC 


CAA 


CCA 


AGT 


Thr 


Gin 


Pro 


Ser 




625 






GAC 


CAA 


AAT 


CCA 


Asp 


Gin 


Asn 


Pro 






645 




GAC 


AAA 


GAA 


AGT 


Asp 


Lys 


Glu 


Ser 








665 


GAA 


AGC 


AAA 


GGT 


Glu 


Ser 


Lys 


Gly 


GGT 


CAT 


TCC 


TCC 


Gly 


His 


Ser 


Ser 


700 








AGT 


GTC 


ACC 


AGC 


Ser 


Val 


Thr 


Ser 




720 






TCC 


AAT 


ATG 


CAT 


Ser 


Asn 


Met 


His 






740 




CAG 


AAT 


GGG 


AAT 


Gin 


Asn 


Gly Asn 








760 


GAC 


ACC 


AGC 


AGT 


Asp 


Thr 


Ser 


Ser 


AGT 


CCT 


AAG 


AAG 


Ser 


Pro 


Lys 


Lys 


795 








CCT 


AGT 


GAT 


GCA 


Pro 


Ser 


Asp 


Ala 




815 






ATG 


AGT 


CAG 


TGC 


Met 


Ser 


Gin 


Cys 






835 




ATT 


AAG 


ACA 


GAG 


He 


Lys 


Thr 


Glu 








855 


CTT 


GGT 


GAT 


CTG 


Leu 


Gly 


Asp 


Leu 


CAT 


CTG 


GGG 


ACT 


His 


Leu 


Gly 


Thr 


890 








TCA 


CAG 


TCT 


GTG 


Ser 


Gin 


Ser 


Val 




910 






CCT 


GTT 


TCT 


GTT 


Pro 


Val 


Ser 


Val 






930
GGC 


TCA 


AGT 


GCT 


CCA 


GTA 


AAA 


AAT 


ATG 


AGT 


GCT 


TTC 


CCG 


ATG 


TTA 


CCA 


AAG 


CAA 


GGC 


Gly 


Ser 


Ser 


Pro 


Pro 


Val 


Lys 


Asn 


He 


Ser 


Ala 


Phe 


Pro 


Met 


Leu 


Pro 


Lys 


Gin 


Pro 








Q "3 Q 

y J D 










940 










945 








950 


ATG 


TTG 


GGT 


GGG 


AAT 


GGA 


AGA 


ATG 


ATG 


GAT 


AGT 


GAG 


GAA 


AAT 


TAT 


GGC 


TCA 


AGT 


ATG 


Met 


Leu 


Gly 


Gly 


Asn 


Pro 


Arg 


Met 


Met 


Asp 


Ser 


Gin 


Glu 


Asn 


Tyr 


Gly 


Ser 


Ser 


Met 










a c c 










960 










965 








GGT 


GGG 


GGA 


AAC 


GGA 


AAT 


GTG 


ACT 


GTG 


AGT 


CAG 


ACT 


CCT 


TCC 


TCA 


GGA 


GAG 


TGG 


GGC 


Gly 


Gly 


Pro 


Asn 


Arg 


Asn 


Val 


Thr 


Val 


Thr 


Gin 


Thr 


Pro 


Ser 


Ser 


Gly 


Asp 


Trp 


Gly 


970 










975 










980 










985 






TTA 


GGA 


AAC 


TCA 


AAG 


GGC 


GGC 


AGA 


ATG 


GAA 


CCT 


ATG 


AAT 


TCA 


AAC 


TCC 


ATG 


GGA 


AGA 


Leu 


Pro 


Asn 


Ser 


Lys 


Ala 


Gly 


Arg 


Met 


Glu 


Pro 


Met 


Asn 


Ser 


Asn 


Ser 


Met 


Gly Arg 




o a A 










yyo 










1000 








1005 




CCA 


GGA 


GGA 


GAT 


TAT 


AAT 


ACT 


TGT 


TTA 


CCG 


AGA 


CCT 


GCA 


CTG 


GGT 


GGC 


TGT 


ATT 


CCG 


Pro 


G±y 


Gly 


Asp 


Tyr 


Asn 


Thr 


Ser 


Leu 


Pro 


Arg 


Pro 


Ala 


Leu 


Gly 


Gly 


Ser 


He 


Pro 






1010 








1015 








1020 






1025 


ACA 


TTG 


GGT 


CTT 


CGG 


TGT 


AAT 


AGG 


ATA 


CCA 


GGT 


GGG 


AGA 


GGA 


GTA 


TTG 


GAA 


CAG 


CAG 


Thr 


Leu 


Pro 


Leu 


Arg 


Ser 


Asn 


Ser 


He 


Pro 


Gly 


Ala 


Arg 


Pro 


Val 


Leu 


Gin 


Gin 


Gin 








1030 








1035 








1040 






1045 


GAG 


GAG 


ATG 


GTT 


CAA 


ATG 


AGG 


GCT 


GGT 


GAA 


ATG 


CCG 


ATG 


GGA 


ATG 


GGG 


GCT 


AAT 


GGC 


Gin 


Gin 


Met 


Leu 


Gin 


Met 


Arg 


Pro 


Gly 


Glu 


He 


Pro 


Met 


Gly 


Met 


Gly 

) 


Ala 


Asn 


Pro 










1050 








1055 








106C 








TAT 


GGG 


CAA 


GGA 


GGA 


GGA 


TGT 


AAC 


CAA 


CTG 


GGT 


TCC 


TGG 


CGG 


GAT 


GGG 


ATG 


TTG 


TCC 


Tyr 


Gly 


Gin 


Ala 


Ala 


Ala 


Ser 


Asn 


Gin 


Leu 


Gly 


Ser 


Trp 


Pro 


Asp 


Gly 


Met 


Leu 


Ser 


1065 








1070 








1075 








1080 






ATG 


GAA 


CAA 


GTT 


TGT 


GAT 


GGC 


AGT 


GAA 


AAT 


AGG 


CCT 


GTT 


CTT 


AGG 


AAT 


TCC 


GTG 


GAT 


Met 


Glu 


Gin 


Val 


Ser 


His 


Gly 


Thr 


Gin 


Asn 


Arg 


Pro 


Leu 


Leu 


Arg 


Asn 


Ser 


Leu 


Asp 




1085 








1090 








1095 






1100 


GAT 


GTT 


GTT 


GGG 


CCA 


CCT 


TCC 


AAC 


CTG 


GAA 


GGC 


CAG 


AGT 


GAG 


GAA 


AGA 


GCA 


TTA 


TTG 


Asp 


Leu 


Val 


Gly 


Pro 


Pro 


Ser 


Asn 


Leu 


Glu 


Gly 


Gin 


Ser 


Asp 


Glu 


Arg 


Ala 


Leu 


Leu 






1105 








1110 








1115 






1120 


GAG 


GAG 


CTG 


GAG 


ACT 


CTT 


CTG 


AGC 


AAC 


ACA 


GAT 


GGC 


AGA 


GGG 


CTG 


GAA 


GAA 


ATT 


GAG 


Asp 


Gin 


Leu 


His 


Thr 


Leu 


Leu 


Ser 


Asn 


Thr 


Asp 


Ala 


Thr 


Gly 


Leu 


Glu 


Glu 


He 


Asp 








1125 








1130 








1135 






1140 


AGA 


GGT 


TTG 


GGG 


ATT 


CCT 


GAA 


CTT 


GTG 


AAT 


CAG 


GGA 


GAG 


GCA 


TTA 


GAG 


CCG 


AAA 


CAG 


Arg 


Ala 


Leu 


Gly 


He 


Pro 


Glu 


Leu 


Val 


Asn 


Gin 


Gly 


Gin 


Ala 


Leu 


Glu 


Pro 


Lys 


Gin 










1145 








1150 








1155 






GAT 


GGT 


TTG 


CAA 


GGC 


GAA 


GAA 


GCA 


GGA 


GTA 


ATG 


ATG 


GAT 


CAG 


AAG 


GCA 


GGA 


TTA 


TAT 


Asp 


Ala 


Phe 


Gin 


Gly 


Gin 


Glu 


Ala 


Ala 


Val 


Met 


Met 


Asp 


Gin 


Lys 


Ala 


Gly 


Leu 


Tyr 


1160 








1165 








1170 








1175 




GGA 


GAG 


AGA 


TAG 


CCA 


GCA 


CAG 


GGG 


CCT 


GCA 


ATG 


CAA 


GGA 


GGC 


TTT 


CAT 


GTT 


CAG 


GGA 


Gly 


Gin 


Thr 


Tyr 


Pro 


Ala 


Gin 


Gly 


Pro 


Pro 


Met 


Gin 


Gly 


Gly 


Phe 


His 


Leu 


Gin 


Gly 




1180 








1185 








1190 








1195 


CAA 


TCA 


CCA 


TGT 


TTT 


AAC 


TGT 


ATG 


ATG 


AAT 


GAG 


ATG 


AAG 


GAG 


GAA 


GGC 


AAT 


TTT 


GGT 


Gin 


Ser 


Pro 


Ser 


Phe 


Asn 


Ser 


Met 


Met 


Asn 


Gin 


Met 


Asn 


Gin 


Gin 


Gly 


Asn 


Phe 


Pro 






1200 








1205 








1210 






1215 


CTG 


GAA 


GGA 


ATG 


CAG 


CCA 


GGA 


GGC 


AAC 


ATG 


ATG 


AGA 


GGC 


CGG 


ACA 


AAC 


ACC 


CCG 


AAG 


Leu 


Gin 


Gly 


Met 


His 


Pro 


Arg 


Ala 


Asn 


He 


Met 


Arg 


Pro 


Arg 


Thr 


Asn 


Thr 


Pro 


Lys 








1220 








1225 








1230 






1235 


GAA 


CTT 


AGA 


ATG 


CAG 


CTT 


CAG 


CAG 


AGG 


CTG 


GAG 


GGC 


GAG 


GAG 


TTT 


TTG 


AAT 


CAG 


AGC 


Gin 


Leu 


Arg 


Met 


Gin 


Leu 


Gin 


Gin 


Arg 


Leu 


Gin 


Gly 


Gin 


Gin 


Phe 


Leu 


Asn 


Gin 


Ser 










1240 








1245 








1250 








GGA 


GAG 


GGA 


CTT 


GAA 


TTG 


AAA 


ATG 


GAA 


AAC 


GCT 


AGT 


GCT 


GGT 


GGT 


GCT 


GGG 


GTG 


ATG 


Arg 


Gin 


Ala 


Leu 


Glu 


Leu 


Lys 


Met 


Glu 


Asn 


Pro 


Thr 


Ala 


Gly 


Gly 


Ala 


Ala 


Val 


Met 


1255 








1260 








1265 




1270 






AGG 


GCT 


ATG 


ATG 


GAG 


CCG 


CAG 


CAG 


GGT 


TTT 


CTT 


AAT 


GCT 


GAA 


ATG 


GTG 


GGC 


CAA 


CGG 


Arg 


Pro 


Met 


Met 


Gin 


Pro 


Gin 


Gin 


Gly 


Phe 


Leu 


Asn 


Ala 


Gin 


Met 


Val 


Ala 


Gin 


Arg 




1275 








1280 








1285 








1290 


AGG 


AGA 


GAG 


CTG 


GTA 


AGT 


CAT 


GAG 


TTG 


GGA 


GAA 


GAG 


AGG 


GTG 


GGT 


ATG 


ATG 


ATG 


GAG 


Ser 


Arg 


Glu 


Leu 


Leu 


Ser 


His 


His 


Phe 


Arg 


Gin 


Gin 


Arg 


Val 


Ala 


Met 


Met 


Met 


Gin 






1295 








1300 








1305 








1310 


GAG 


GAG 


GAG 


GAG 


CAG 


CAA 


CAG 


GAG 


GAG 


CAG 


GAG 


CAG 


CAG 


CAG 


GAG 


GAA 


CAG 


GAA 


CAG 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 








1315 








1320 








1325 






1330 


CAA 


GAG 


GAA 


GAG 


CAG 


CAA 


CAG 


CAG 


CAA 


ACC 


CAG 


GGC 


TTC 


AGG 


CCA 


CCT 


CCT 


AAT 


GTG 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Thr 


Gin 


Ala 


Phe 


Ser 


Pro 


Pro 


Pro 


Asn 


Val 










1335 








1340 








1345 








ACT 


GGT 


TCC 


CCG 


AGG 


ATG 


GAT 


GGG 


GTT 


TTG 


GCA 


GGA 


CCG 


ACA 


ATG 


CCA 


CAA 


GGT 


CCT 


Thr 


Ala 


Ser 


Pro 


Ser 


Met 


Asp 


Gly 


Leu 


Leu 


Ala 


Gly 


Pro 


Thr 


Met 


Pro 


Gin 


Ala 


Pro 


1350 








1355 








1360 








1365 






CCG 


CAA 


GAG 


TTT 


CCA 


TAT 


CAA 


CCA 


AAT 


TAT 


GGA 


ATG 


GGA 


CAA 


CAA 


CCA 


GAT 


CCA 


GGG
Pro Gln Gln Phe Pro Tyr Gln Pro Asn Tyr Gly Met Gly Gln Gln Pro Asp Pro Ala

1370 1375 1380 1385 

TTT GGT CGA GTG TCT AGT CCT CCC AAT GCA ATG ATG TCG TCA AGA ATG GGT CCC TCC
Phe Gly Arg Val Ser Ser Pro Pro Asn Ala Met Met Ser Ser Arg Met Gly Pro Ser
1390 1395 1400 1405

CAG AAT CCC ATG ATG CAA CAC CCG CAG GCT GCA TCC ATC TAT CAG TCC TCA GAA ATG
Gln Asn Pro Met Met Gln His Pro Gln Ala Ala Ser Ile Tyr Gln Ser Ser Glu Met
1410 1415 1420 1425

AAG GGC TGG CCA TCA GGA AAT TTG GCC AGG AAC AGC TCC TTT TCC CAG CAG CAG TTT
Lys Gly Trp Pro Ser Gly Asn Leu Ala Arg Asn Ser Ser Phe Ser Gln Gln Gln Phe
1430 1435 1440

GCC CAC CAG GGG AAT CCT GCA GTG TAT AGT ATG GTG CAC ATG AAT GGC AGC AGT GGT
Ala His Gln Gly Asn Pro Ala Val Tyr Ser Met Val His Met Asn Gly Ser Ser Gly
1445 1450 1455 1460

CAC ATG GGA CAG ATG AAC ATG AAC CCC ATG CCC ATG TCT GGC ATG CCT ATG GGT CCT
His Met Gly Gln Met Asn Met Asn Pro Met Pro Met Ser Gly Met Pro Met Gly Pro
1465 1470 1475 1480

GAT CAG AAA TAC TGC TGA CAT CTC TGC ACC AGG ACC TCT TAA GGA AAC CAC TGT ACA
Asp Gln Lys Tyr Cys ***
1485 1490 1495 1500

AAT GAC ACT GCA CTA GGA TTA TTG GGA AGG AAT CAT TGT TCC AGG CAT CCA TCT TGG 
1505 1510 1515 1520 

AAG AAA GGA CCA GCT TTG AGC TCC ATC AAG GGT ATT TTA AGT GAT GTC ATT TGA GCA 
1525 1530 1535 

GGA CTG GAT TTT AAG CCG AAG GGC AAT ATC TAC GTG TTT TTC CCC CCT CCT TCT GCT
1540 1545 1550 1555 

GTG TAT CAT GGT GTT CAA AAC AGA AAT GTT TTT TGG CAT TCC ACC TCC TAG GGA TAT 

1560 1565 1570 1575 

AAT TCT GGA GAC ATG GAG TGT TAC TGA TCA TM AAC TTT TGT GTC ACT TTT TTC TGC 
1580 1585 1590 1595

CTT GCT AGC CAA AAT CTC TTA AAT ACA CGT AGG TGG GCC AGA GAA CAT TGG AAG AAT 
1600 1605 1610 1615 

CAA GAG AGA TTA GAA TAT CTG GTT TCT CTA GTT GCA GTA TTG GAC AAA GAG CAT AGT 
1620 1625 1630 

CCC AGC CTT CAG GTG TAG TAG TTC TGT GTT GAC CCT TTG TCC AGT GGA ATT GGT GAT
1635 1640 1645 1650 

TCT GAA TTG TCC TTT ACT AAT GGT GTT GAG TTG CTC TGT CCC TAT TAT TTG CCC TAG 
1655 1660 1665 1670 

GCT TTC TCC TAA TGA AGG TTT TCA TTT GCC ATT CAT GTC CTG TAA TAC TTC ACC TCC
1675 1680 1685 1690 

AGG AAC TGT CAT GGA TGT CCA AAT GGC TTT GCA GAA AGG AAA TGA GAT GAC AGT ATT 
1695 1700 1705 1710 

TAA TCG CAG CAG TAG CAA ACT TTT CAC ATG CTA ATG TGC AGC TGA GTG CAC TTT ATT 
1715 1720 1725

TAA AAA GAA TGG ATA AAT GCA ATA TTC TTG AGG TCT TGA GGG AAT AGT GAA ACA CAT 
1730 1735 1740 1745 

TCC TGG TTT TTG CCT ACA CTT ACG TGT TAG ACA AGA ACT ATG ATT TTT TTT TTA AAG 
1750 1755 1760 1765 

TAC TGG TGT CAC CCT TTG CCT ATA TGG TAG AGC AAT AAT GCT TTT TAA AAA TAA ACT
1770 1775 1780 1785 

TCT GAA AAC CCA AGG CCA GGT ACT GCA TTC TGA ATC AGA ATC TCG CAG TGT TTC TGT 
1790 1795 1800 1805 

GAA TAG ATT TTT TTG TAA ATA TGA CCT TTA AGA TAT TGT ATT ATG TAA AAT ATG TAT 
1810 1815 1820

ATA CCT TTT TTT GTA GGT CAC AAC AAC TCA TTT TTA CAG AGT TTG TGA AGC TAA ATA 
1825 1830 1835 1840

TTT AAC ATT GTT GAT TTC AGT AAG CTG TGT GGT GAG GCT ACC AGT GGA AGA GAC ATC 
1845 1850 1855 1860 

CCT TGA CTT TTG TGG CCT GGG GGA GGG GTA GTG CTC CAC AGC TTT TCC TTC CCC ACC
1865 1870 1875 1880 

CCC CAG CCT TAG ATG CCT CGC TCT TTT CAA TCT CTT AAT CTA AAT GCT TTT TAA AGA 
1885 1890 1895 1900 

GAT TAT TTG TTT AGA TGT AGG CAT TTT AAT TTT TTA AAA ATT CCT CTA CCA GAA CTA 
1905 1910 1915

AGC ACT TTG TTA ATT TGG GGG GAA AGA ATA GAT ATG GGG AAA TAA ACT TAA AAA AAA 
1920 1925 1930 1935 

ATC AGG AAT TTA AAA AAA CGA GCA ATT TGA AGA GAA TCT TTT GGA TTT TAA GCA GTC 
1940 1945 1950 1955 

CGA AAT AAT AGC AAT TCA TGG GCT GTG TGT GTG TGT GTA TGT GTG TGT GTG TGT GTG
1960 1965 1970 1975
TAT GTT TAA TTA TGT TAG CTT TTC ATC CCC TTT AGG AGC GTT TTC AGA TTT TGG TTG
1980 1985 1990 1995 

CTA AGA CCT GAA TCC CAT ATT GAG ATC TCG AGT AGA ATC CTT GGT GTG GTT TCT GGT 
2000 2005 2010 

GTC TGC TCA GCT GTC CCC TCA TTC TAC TAA TGT GAT GCT TTC ATT ATG TCC CTG TGG

2015 2020 2025 2030 

ATT AGA ATA GTG TCA GTT ATT TCT TAA GTA ACT CAG TAC CCA GAA CAG CCA GTT TTA 

2035 2040 2045 2050 

CTG TGA TTC AGA GCC ACA GTC TAA CTG AGC ACC TTT TAA ACC CCT CCC TCT TCT GCC 
2055 2060 2065 2070

CCC TAC CAC TTT TCT GCT GTT GCC TCT CTT TGA CAC CTG TTT TAG TCA GTT GGG AGG 
2075 2080 2085 2090 

AAG GGA AAA ATC AAG TTT AAT TCC CTT TAT CTG GGT TAA TTC ATT TGG TTC AAA TAG 
2095 2100 2105 

TTG ACG GAA TTG GGT TTC TGA ATG TCT GTG AAT TTC AGA GGT CTC TGC TAG CCT TGG
2110 2115 2120 2125 

TAT CAT TTT CTA GCA ATA ACT GAG AGC CAG TTA ATT TTA AGA ATT TCA CAC ATT TAG 

2130 2135 2140 2145 

CCA ATC TTT CTA GAT GTC TCT GAA GGT AAG ATC ATT TAA TAT CTT TGA TAT GCT TAC 
2150 2155 2160 2165

GAG TAA GTG AAT CCT GAT TAT TTC CAG ACC CAC CAC CAG AGT GGA TCT TAT TTT CAA 
2170 2175 2180 2185 

AGC AGT ATA GAC AAT TAT GAG TTT GCC CTC TTT CCC CTA CCA AGT TCA AAA TAT ATC 
2190 2195 2200 

TAA GAA AGA TTG TAA ATC CGA AAA CTT CCA TTG TAG TGG CCT GTG CTT TTC AGA TAG

2205 2210 2215 2220 

TAT ACT CTC CTG TTT GGA GAC AGA GGA AGA ACC AGG TCA GTC TGT CTC TTT TTC AGC 

2225 2230 2235 2240 

TCA ATT GTA TCT GAC CCT TCT TTA AGT TAT GTG TGT GGG GAG AAA TAG AAT GGT GCT 
2245 2250 2255 2260

CTT ATC TTT CTT GAC TTT AAA AAA ATT ATT AAA AAC AAA AAA AAA AAA AAA AA 
2265 2270 2275 
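
The listing above pairs each codon with a three-letter amino acid code. A pairing of this kind can be checked mechanically against the standard genetic code. The sketch below is illustrative only: the names (CODON_TABLE, THREE_LETTER, translate) are hypothetical, and the example input is the final coding line of SEQ ID NO: 1 as transcribed in this document.

```python
# Sketch: translate codons with the standard genetic code and report
# three-letter amino acid codes, as used in the sequence listing.
# Names are illustrative assumptions, not part of the application.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",  # stop codon, written "***" in the listing
}


def translate(dna: str) -> list:
    """Translate a DNA string codon by codon, stopping after the first stop."""
    dna = dna.upper().replace(" ", "")
    residues = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = THREE_LETTER[CODON_TABLE[dna[pos:pos + 3]]]
        residues.append(residue)
        if residue == "***":
            break
    return residues


if __name__ == "__main__":
    # Last coding codons of SEQ ID NO: 1 as transcribed in the listing.
    print(" ".join(translate("GAT CAG AAA TAC TGC TGA CAT CTC")))
```

For this input the translation stops at TGA and agrees with the residues printed under the same codons in the listing (Asp Gln Lys Tyr Cys followed by the stop marker). Running the same check over other lines is one way to spot transcription errors in either the codons or the residue labels.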



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186
(B) TYPE: amino acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:





Leu 


Leu 


Gin 


Ala 


Leu 


Asp 


Gly 


Phe 




1 








5 










Phe 


Val 


Ser 


Glu 


Asn 


Val 


Thr 


Gin 


45 


20 










25 








Thr 


Ser 


Val 
40 


Tyr 


Asn 


lie 


Leu 


His 
45 




Pro 


Lys 


Ser 


Thr 
60 


Val 


Asn 


Gly 


Val 


50 


His 


Thr 


Phe 


Asn 


Cys 
80 


Arg 


Met 


Leu 




Asn 


Ala 


Ser 


Pro 


Glu 


Met 


Arg 


Gin 




95 










100 








Gin 


Pro 


Arg 


Ala 


Met 


Met 


Glu 


Glu 


55 




115 










120 






Ala 


Arg 


Arg 
135 


lie 


Thr 


Thr 


Gly 


Glu 
140 




Thr 


Arg 


His 


Asp 
155 


Leu 


Ser 


Gly 


Lys 


60 


, Ser 


Met 


Arg 


Pro 


Gly 
175 


Phe 


Glu 


Asp 



Leu 


Phe 


Val 


Val' 


Asn 


Arg 


Asp 


Gly 


Asn 


He 


Val 




10 










15 










Tyr 


Leu 


Gin 


Tyr 


Lys 


Gin 


Glu 


Asp 


Leu 


Val 


Asn 






30 










35 








Glu 


Glu 


Asp 


Arg 


Lys 


Asp 


Phe 


Leu 


Lys 


Asn 


Leu 










50 










55 




Ser 


Trp 


Thr 


Asn 


Glu 


Thr 


Gin 


Arg 


Gin 


Lys 


Ser 


65 










70 










75 


Met 


Lys 


Thr 


Pro 


His 


Asp 


He 


Leu 


Glu 


Asp 


He 




85 










90 










Arg 


Tyr 


Glu 


Thr 


Met 


Gin 


Cys 


Phe 


Ala 


Leu 


Ser 






105 










110 








Gly 


Glu 


Asp 


Leu 


Gin 


Ser 


Cys 


Met 


He 


Cys 


Val 








125 










130 






Arg 


Thr 


Phe 


Pro 


Ser 


Asn 


Pro 


Glu 


Ser 


Phe 


He 










145 










150 




Val 


^Val 


Asn 


He 


Asp 


Thr 


Asn 


Ser 


Leu 


Arg 


Ser 


160 










165 








170 


He 


He 


Arg 


Arg 


Cys 


He 


Gin 











180 185 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 73 

(B) TYPE: amino acid 

(C) STRANDEDNESS: Single 
(D) TOPOLOGY: Linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Arg Lys Arg Lys Leu Pro Cys Asp Thr Pro Gly Gln Gly Leu Thr Cys Ser Gly Glu 
1 5 10 15 

Lys Arg Arg Arg Glu Gln Glu Ser Lys Tyr Ile Glu Glu Leu Ala Glu Leu Ile Ser 
20 25 130 135 

Ala Asn Leu Ser Asp Ile Asp Asn Phe Asn Val Lys Pro Asp Lys Cys Ala Ile Leu 
140 145 150 155 

Lys Glu Thr Val Arg Gln Ile Arg Gln Ile Lys Glu Gln Gly Lys Thr 
160 165 170 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1419 

(B) TYPE: human amino acid of AIB1 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Met Ser Gly Leu Gly Glu Asn Leu Asp Pro Leu Ala Ser Asp Ser Arg Lys Arg Lys 
1 5 10 15 

Leu Pro Cys Asp Thr Pro Gly Gln Gly Leu Thr Cys Ser Gly Glu Lys Arg Arg Arg 
20 25 30 35 

Glu Gln Glu Ser Lys Tyr Ile Glu Glu Leu Ala Glu Leu Ile Ser Ala Asn Leu Ser 
40 45 50 55 

Asp Ile Asp Asn Phe Asn Val Lys Pro Asp Lys Cys Ala Ile Leu Lys Glu Thr Val 
60 65 70 75 

Arg Gln Ile Arg Gln Ile Lys Glu Gln Gly Lys Thr Ile Ser Asn Asp Asp Asp Val 
80 85 90 95 

Gln Lys Ala Asp Val Ser Ser Thr Gly Gln Gly Val Ile Asp Lys Asp Ser Leu Gly 
100 105 110 

Pro Leu Leu Leu Gln Ala Leu Asp Gly Phe Leu Phe Val Val Asn Arg Asp Gly Asn 
115 120 125 130 

Ile Val Phe Val Ser Glu Asn Val Thr Gln Tyr Leu Gln Tyr Lys Gln Glu Asp Leu 
135 140 145 150 

Val Asn Thr Ser Val Tyr Asn Ile Leu His Glu Glu Asp Arg Lys Asp Phe Leu Lys 
155 160 165 170 

Asn Leu Pro Lys Ser Thr Val Asn Gly Val Ser Trp Thr Asn Glu Thr Gln Arg Gln 
175 180 185 190 

Lys Ser His Thr Phe Asn Cys Arg Met Leu Met Lys Thr Pro His Asp Ile Leu Glu 
195 200 205 

Asp Ile Asn Ala Ser Pro Glu Met Arg Gln Arg Tyr Glu Thr Met Gln Cys Phe Ala 
210 215 220 225 

Leu Ser Gln Pro Arg Ala Met Met Glu Glu Gly Glu Asp Leu Gln Ser Cys Met Ile 
230 235 240 245 

Cys Val Ala Arg Arg Ile Thr Thr Gly Glu Arg Thr Phe Pro Ser Asn Pro Glu Ser 
250 255 260 265 

Phe Ile Thr Arg His Asp Leu Ser Gly Lys Val Val Asn Ile Asp Thr Asn Ser Leu 
270 275 280 285 

Arg Ser Ser Met Arg Pro Gly Phe Glu Asp Ile Ile Arg Arg Cys Ile Gln Arg Phe 
290 295 300 

Phe Ser Leu Asn Asp Gly Gln Ser Trp Ser Gln Lys Arg His Tyr Gln Glu Ala Tyr 
305 310 315 320 

Leu Asn Gly His Ala Glu Thr Pro Val Tyr Arg Phe Ser Leu Ala Asp Gly Thr Ile 
325 330 335 340 

Val Thr Ala Gln Thr Lys Ser Lys Leu Phe Arg Asn Pro Val Thr Asn Asp Arg His 
345 350 355 360 

Gly Phe Val Ser Thr His Phe Leu Gln Arg Glu Gln Asn Gly Tyr Arg Pro Asn Pro 
365 370 375 380 

Asn Pro Val Gly Gln Gly Ile Arg Pro Pro Met Ala Gly Cys Asn Ser Ser Val Gly 
385 390 395 

Gly Met Ser Met Ser Pro Asn Gln Gly Leu Gln Met Pro Ser Ser Arg Ala Tyr Gly 
400 405 410 415 

Leu Ala Asp Pro Ser Thr Thr Gly Gln Met Ser Gly Ala Arg Tyr Gly Gly Ser Ser 
420 425 430 435 

Asn Ile Ala Ser Leu Thr Pro Gly Pro Gly Met Gln Ser Pro Ser Ser Tyr Gln Asn 
440 445 450 455 

Asn Asn Tyr Gly Leu Asn Met Ser Ser Pro Pro His Gly Ser Pro Gly Leu Ala Pro 
460 465 470 475 

Asn Gln Gln Asn Ile Met Ile Ser Pro Arg Asn Arg Gly Ser Pro Lys Ile Ala Ser 
480 485 490 

His Gln Phe Ser Pro Val Ala Gly Val His Ser Pro Met Ala Ser Ser Gly Asn Thr 
495 500 505 510 

Gly Asn His Ser Phe Ser Ser Ser Ser Leu Ser Ala Leu Gln Ala Ile Ser Glu Gly 
515 520 525 530 

Val Gly Thr Ser Leu Leu Ser Thr Leu Ser Ser Pro Gly Pro Lys Leu Asp Asn Ser 
535 540 545 550 

Pro Asn Met Asn Ile Thr Gln Pro Ser Lys Val Ser Asn Gln Asp Ser Lys Ser Pro 
555 560 565 570 

Leu Gly Phe Tyr Cys Asp Gln Asn Pro Val Glu Ser Ser Met Cys Gln Ser Asn Ser 
575 580 585 

Arg Asp His Leu Ser Asp Lys Glu Ser Lys Glu Ser Ser Val Glu Gly Ala Glu Asn 
590 595 600 605 

Gln Arg Gly Pro Leu Glu Ser Lys Gly His Lys Lys Leu Leu Gln Leu Leu Thr Cys 
610 615 620 625 

Ser Ser Asp Asp Arg Gly His Ser Ser Leu Thr Asn Ser Pro Leu Asp Ser Ser Cys 
630 635 640 645 

Lys Glu Ser Ser Val Ser Val Thr Ser Pro Ser Gly Val Ser Ser Ser Thr Ser Gly 
650 655 660 665 

Gly Val Ser Ser Thr Ser Asn Met His Gly Ser Leu Leu Gln Glu Lys His Arg Ile 
670 675 680 

Leu His Lys Leu Leu Gln Asn Gly Asn Ser Pro Ala Glu Val Ala Lys Ile Thr Ala 
685 690 695 700 

Glu Ala Thr Gly Lys Asp Thr Ser Ser Ile Thr Ser Cys Gly Asp Gly Asn Val Val 
705 710 715 720 

Lys Gln Glu Gln Leu Ser Pro Lys Lys Lys Glu Asn Asn Ala Leu Leu Arg Tyr Leu 
725 730 735 740 

Leu Asp Arg Asp Asp Pro Ser Asp Ala Leu Ser Lys Glu Leu Gln Pro Gln Val Glu 
745 750 755 760 

Gly Val Asp Asn Lys Met Ser Gln Cys Thr Ser Ser Thr Ile Pro Ser Ser Ser Gln 
765 770 775 

Glu Lys Asp Pro Lys Ile Lys Thr Glu Thr Ser Glu Glu Gly Ser Gly Asp Leu Asp 
780 785 790 795 

Asn Leu Asp Ala Ile Leu Gly Asp Leu Thr Ser Ser Asp Phe Tyr Asn Asn Ser Ile 
800 805 810 815 

Ser Ser Asn Gly Ser His Leu Gly Thr Lys Gln Gln Val Phe Gln Gly Thr Asn Ser 
820 825 830 835 

Leu Gly Leu Lys Ser Ser Gln Ser Val Gln Ser Ile Arg Pro Pro Tyr Asn Arg Ala 
840 845 850 855 

Val Ser Leu Asp Ser Pro Val Ser Val Gly Ser Ser Pro Pro Val Lys Asn Ile Ser 
860 865 870 

Ala Phe Pro Met Leu Pro Lys Gln Pro Met Leu Gly Gly Asn Pro Arg Met Met Asp 
875 880 885 890 

Ser Gln Glu Asn Tyr Gly Ser Ser Met Gly Gly Pro Asn Arg Asn Val Thr Val Thr 
895 900 905 910 

Gln Thr Pro Ser Ser Gly Asp Trp Gly Leu Pro Asn Ser Lys Ala Gly Arg Met Glu 
915 920 925 930 

Pro Met Asn Ser Asn Ser Met Gly Arg Pro Gly Gly Asp Tyr Asn Thr Ser Leu Pro 
935 940 945 950 

Arg Pro Ala Leu Gly Gly Ser Ile Pro Thr Leu Pro Leu Arg Ser Asn Ser Ile Pro 
955 960 965 

Gly Ala Arg Pro Val Leu Gln Gln Gln Gln Gln Met Leu Gln Met Arg Pro Gly Glu 
970 975 980 985 

Ile Pro Met Gly Met Gly Ala Asn Pro Tyr Gly Gln Ala Ala Ala Ser Asn Gln Leu 
990 995 1000 1005 

Gly Ser Trp Pro Asp Gly Met Leu Ser Met Glu Gln Val Ser His Gly Thr Gln Asn 
1010 1015 1020 1025 

Arg Pro Leu Leu Arg Asn Ser Leu Asp Asp Leu Val Gly Pro Pro Ser Asn Leu Glu 
1030 1035 1040 1045 

Gly Gln Ser Asp Glu Arg Ala Leu Leu Asp Gln Leu His Thr Leu Leu Ser Asn Thr 
1050 1055 1060 

Asp Ala Thr Gly Leu Glu Glu Ile Asp Arg Ala Leu Gly Ile Pro Glu Leu Val Asn 
1065 1070 1075 1080 

Gln Gly Gln Ala Leu Glu Pro Lys Gln Asp Ala Phe Gln Gly Gln Glu Ala Ala Val 
1085 1090 1095 1100 

Met Met Asp Gln Lys Ala Gly Leu Tyr Gly Gln Thr Tyr Pro Ala Gln Gly Pro Pro 
1105 1110 1115 1120 

Met Gln Gly Gly Phe His Leu Gln Gly Gln Ser Pro Ser Phe Asn Ser Met Met Asn 
1125 1130 1135 1140 

Gln Met Asn Gln Gln Gly Asn Phe Pro Leu Gln Gly Met His Pro Arg Ala Asn Ile 
1145 1150 1155 

Met Arg Pro Arg Thr Asn Thr Pro Lys Gln Leu Arg Met Gln Leu Gln Gln Arg Leu 
1160 1165 1170 1175 

Gln Gly Gln Gln Phe Leu Asn Gln Ser Arg Gln Ala Leu Glu Leu Lys Met Glu Asn 
1180 1185 1190 1195 

Pro Thr Ala Gly Gly Ala Ala Val Met Arg Pro Met Met Gln Pro Gln Gln Gly Phe 
1200 1205 1210 1215 

Leu Asn Ala Gln Met Val Ala Gln Arg Ser Arg Glu Leu Leu Ser His His Phe Arg 
1220 1225 1230 1235 

Gln Gln Arg Val Ala Met Met Met Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln 
1240 1245 1250 

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Thr 
1255 1260 1265 1270 

Gln Ala Phe Ser Pro Pro Pro Asn Val Thr Ala Ser Pro Ser Met Asp Gly Leu Leu 
1275 1280 1285 1290 

Ala Gly Pro Thr Met Pro Gln Ala Pro Pro Gln Gln Phe Pro Tyr Gln Pro Asn Tyr 
1295 1300 1305 1310 

Gly Met Gly Gln Gln Pro Asp Pro Ala Phe Gly Arg Val Ser Ser Pro Pro Asn Ala 
1315 1320 1325 1330 

Met Met Ser Ser Arg Met Gly Pro Ser Gln Asn Pro Met Met Gln His Pro Gln Ala 
1335 1340 1345 

Ala Ser Ile Tyr Gln Ser Ser Glu Met Lys Gly Trp Pro Ser Gly Asn Leu Ala Arg 
1350 1355 1360 1365 

Asn Ser Ser Phe Ser Gln Gln Gln Phe Ala His Gln Gly Asn Pro Ala Val Tyr Ser 
1370 1375 1380 1385 

Met Val His Met Asn Gly Ser Ser Gly His Met Gly Gln Met Asn Met Asn Pro Met 
1390 1395 1400 1405 

Pro Met Ser Gly Met Pro Met Gly Pro Asp Gln Lys Tyr Cys *** 
1410 1415 1420 


(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleotides 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 



5'-TCATCACTTCCGACAACAGAGG-3' 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleotides 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 



5'-CCAGAAACGTCACTATCAAG-3' 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 

(B) TYPE: nucleotides 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 



5'-TTACTGGAACCCCCATACC-3' 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 950 

(B) TYPE: amino acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 



Cys Ile Gln Arg Phe Phe Ser Leu Asn Asp Gly Gln Ser Trp Ser Gln Lys Arg His 
1 5 10 15 

Tyr Gln Glu Ala Tyr Leu Asn Gly His Ala Glu Thr Pro Val Tyr Arg Phe Ser Leu 
20 25 30 35 

Ala Asp Gly Thr Ile Val Thr Ala Gln Thr Lys Ser Lys Leu Phe Arg Asn Pro Val 
40 45 50 55 

Thr Asn Asp Arg His Gly Phe Val Ser Thr His Phe Leu Gln Arg Glu Gln Asn Gly 
60 65 70 75 

Tyr Arg Pro Asn Pro Asn Pro Val Gly Gln Gly Ile Arg Pro Pro Met Ala Gly Cys 
80 85 90 95 

Asn Ser Ser Val Gly Gly Met Ser Met Ser Pro Asn Gln Gly Leu Gln Met Pro Ser 
100 105 110 

Ser Arg Ala Tyr Gly Leu Ala Asp Pro Ser Thr Thr Gly Gln Met Ser Gly Ala Arg 
115 120 125 130 

Tyr Gly Gly Ser Ser Asn Ile Ala Ser Leu Thr Pro Gly Pro Gly Met Gln Ser Pro 
135 140 145 150 

Ser Ser Tyr Gln Asn Asn Asn Tyr Gly Leu Asn Met Ser Ser Pro Pro His Gly Ser 
155 160 165 170 

Pro Gly Leu Ala Pro Asn Gln Gln Asn Ile Met Ile Ser Pro Arg Asn Arg Gly Ser 
175 180 185 190 

Pro Lys Ile Ala Ser His Gln Phe Ser Pro Val Ala Gly Val His Ser Pro Met Ala 
195 200 205 

Ser Ser Gly Asn Thr Gly Asn His Ser Phe Ser Ser Ser Ser Leu Ser Ala Leu Gln 
210 215 220 225 

Ala Ile Ser Glu Gly Val Gly Thr Ser Leu Leu Ser Thr Leu Ser Ser Pro Gly Pro 
230 235 240 245 

Lys Leu Asp Asn Ser Pro Asn Met Asn Ile Thr Gln Pro Ser Lys Val Ser Asn Gln 
250 255 260 265 

Asp Ser Lys Ser Pro Leu Gly Phe Tyr Cys Asp Gln Asn Pro Val Glu Ser Ser Met 
270 275 280 285 

Cys Gln Ser Asn Ser Arg Asp His Leu Ser Asp Lys Glu Ser Lys Glu Ser Ser Val 
290 295 300 

Glu Gly Ala Glu Asn Gln Arg Gly Pro Leu Glu Ser Lys Gly His Lys Lys Leu Leu 
305 310 315 320 

Gln Leu Leu Thr Cys Ser Ser Asp Asp Arg Gly His Ser Ser Leu Thr Asn Ser Pro 
325 330 335 340 

Leu Asp Ser Ser Cys Lys Glu Ser Ser Val Ser Val Thr Ser Pro Ser Gly Val Ser 
345 350 355 360 

Ser Ser Thr Ser Gly Gly Val Ser Ser Thr Ser Asn Met His Gly Ser Leu Leu Gln 
365 370 375 380 

Glu Lys His Arg Ile Leu His Lys Leu Leu Gln Asn Gly Asn Ser Pro Ala Glu Val 
385 390 395 

Ala Lys Ile Thr Ala Glu Ala Thr Gly Lys Asp Thr Ser Ser Ile Thr Ser Cys Gly 
400 405 410 415 

Asp Gly Asn Val Val Lys Gln Glu Gln Leu Ser Pro Lys Lys Lys Glu Asn Asn Ala 
420 425 430 435 

Leu Leu Arg Tyr Leu Leu Asp Arg Asp Asp Pro Ser Asp Ala Leu Ser Lys Glu Leu 
440 445 450 455 

Gln Pro Gln Val Glu Gly Val Asp Asn Lys Met Ser Gln Cys Thr Ser Ser Thr Ile 
460 465 470 475 

Pro Ser Ser Ser Gln Glu Lys Asp Pro Lys Ile Lys Thr Glu Thr Ser Glu Glu Gly 
480 485 490 

Ser Gly Asp Leu Asp Asn Leu Asp Ala Ile Leu Gly Asp Leu Thr Ser Ser Asp Phe 
495 500 505 510 

Tyr Asn Asn Ser Ile Ser Ser Asn Gly Ser His Leu Gly Thr Lys Gln Gln Val Phe 
515 520 525 530 

Gln Gly Thr Asn Ser Leu Gly Leu Lys Ser Ser Gln Ser Val Gln Ser Ile Arg Pro 
535 540 545 550 

Pro Tyr Asn Arg Ala Val Ser Leu Asp Ser Pro Val Ser Val Gly Ser Ser Pro Pro 
555 560 565 570 

Val Lys Asn Ile Ser Ala Phe Pro Met Leu Pro Lys Gln Pro Met Leu Gly Gly Asn 
575 580 585 

Pro Arg Met Met Asp Ser Gln Glu Asn Tyr Gly Ser Ser Met Gly Gly Pro Asn Arg 
590 595 600 605 

Asn Val Thr Val Thr Gln Thr Pro Ser Ser Gly Asp Trp Gly Leu Pro Asn Ser Lys 
610 615 620 625 

Ala Gly Arg Met Glu Pro Met Asn Ser Asn Ser Met Gly Arg Pro Gly Gly Asp Tyr 
630 635 640 645 

Asn Thr Ser Leu Pro Arg Pro Ala Leu Gly Gly Ser Ile Pro Thr Leu Pro Leu Arg 
650 655 660 665 

Ser Asn Ser Ile Pro Gly Ala Arg Pro Val Leu Gln Gln Gln Gln Gln Met Leu Gln 
670 675 680 

Met Arg Pro Gly Glu Ile Pro Met Gly Met Gly Ala Asn Pro Tyr Gly Gln Ala Ala 
685 690 695 700 

Ala Ser Asn Gln Leu Gly Ser Trp Pro Asp Gly Met Leu Ser Met Glu Gln Val Ser 
705 710 715 720 

His Gly Thr Gln Asn Arg Pro Leu Leu Arg Asn Ser Leu Asp Asp Leu Val Gly Pro 
725 730 735 740 

Pro Ser Asn Leu Glu Gly Gln Ser Asp Glu Arg Ala Leu Leu Asp Gln Leu His Thr 
745 750 755 760 

Leu Leu Ser Asn Thr Asp Ala Thr Gly Leu Glu Glu Ile Asp Arg Ala Leu Gly Ile 
765 770 775 

Pro Glu Leu Val Asn Gln Gly Gln Ala Leu Glu Pro Lys Gln Asp Ala Phe Gln Gly 
780 785 790 795 

Gln Glu Ala Ala Val Met Met Asp Gln Lys Ala Gly Leu Tyr Gly Gln Thr Tyr Pro 
800 805 810 815 

Ala Gln Gly Pro Pro Met Gln Gly Gly Phe His Leu Gln Gly Gln Ser Pro Ser Phe 
820 825 830 835 

Asn Ser Met Met Asn Gln Met Asn Gln Gln Gly Asn Phe Pro Leu Gln Gly Met His 
840 845 850 855 

Pro Arg Ala Asn Ile Met Arg Pro Arg Thr Asn Thr Pro Lys Gln Leu Arg Met Gln 
860 865 870 

Leu Gln Gln Arg Leu Gln Gly Gln Gln Phe Leu Asn Gln Ser Arg Gln Ala Leu Glu 
875 880 885 890 

Leu Lys Met Glu Asn Pro Thr Ala Gly Gly Ala Ala Val Met Arg Pro Met Met Gln 
895 900 905 910 

Pro Gln Gln Gly Phe Leu Asn Ala Gln Met Val Ala Gln Arg Ser Arg Glu Leu Leu 
915 920 925 930 

Ser His His Phe Arg Gln Gln Arg Val Ala Met Met Met Gln Gln Gln Gln Gln Gln 
935 940 945 950 

Gln 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4621 nucleotides; 1539 amino acid residues 

(B) TYPE: mouse DNA and amino acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

G GCG GCG AAC GGA TCA AAA GAA TTT GCT GAA CAG TGG ACT CCG AGA TCG GTA AAA 
1 5 10 15 

CGA ACT CTT CCC TGC CCT TCC TGA ACA GCT GTC AGT TGC TGA TCT GTG ATC AGG 
20 25 30 35 

ATG AGT GGA CTA GGC GAA AGC TCT TTG GAT CCG CTG GCC GCT GAG TCT CGG AAA 
Met Ser Gly Leu Gly Glu Ser Ser Leu Asp Pro Leu Ala Ala Glu Ser Arg Lys 
40 45 50 55 

CGC AAA CTG CCC TGT GAT GCC CCA GGA CAG GGG CTT GTC TAC AGT GGT GAG AAG 
Arg Lys Leu Pro Cys Asp Ala Pro Gly Gln Gly Leu Val Tyr Ser Gly Glu Lys 
60 65 70 

TGG CGA CGG GAG CAG GAG AGC AAG TAC ATA GAG GAG CTG GCA GAG CTC ATC TCT 
Trp Arg Arg Glu Gln Glu Ser Lys Tyr Ile Glu Glu Leu Ala Glu Leu Ile Ser 
75 80 85 90 

GCA AAT CTC AGC GAC ATC GAC AAC TTC AAT GTC AAG CCA GAT AAA TGT GCC ATC 
Ala Asn Leu Ser Asp Ile Asp Asn Phe Asn Val Lys Pro Asp Lys Cys Ala Ile 
95 100 105 

CTA AAG GAG ACA GTG AGA CAG ATA CGG CAA ATA AAA GAA CAA GGA AAA ACT ATT 
Leu Lys Glu Thr Val Arg Gln Ile Arg Gln Ile Lys Glu Gln Gly Lys Thr Ile 
110 115 120 125 

TCC AGT GAT GAT GAT GTT CAA AAA GCT GAT GTG TCT TCT ACA GGG CAG GGA GTC 
Ser Ser Asp Asp Asp Val Gln Lys Ala Asp Val Ser Ser Thr Gly Gln Gly Val 
130 135 140 145 

ATT GAT AAA GAC TCT TTA GGA CCG CTT TTA CTA CAG GCA CTG GAT GGT TTC CTG 
Ile Asp Lys Asp Ser Leu Gly Pro Leu Leu Leu Gln Ala Leu Asp Gly Phe Leu 
150 155 160 

TTT GTG GTG AAT CGA GAT GGA AAC ATT GTA TTC GTG TCA GAA AAT GTC ACA CAG 
Phe Val Val Asn Arg Asp Gly Asn Ile Val Phe Val Ser Glu Asn Val Thr Gln 
165 170 175 180 

TAT CTG CAG TAC AAG CAG GAG GAC CTG GTT AAC ACA AGT GTC TAC AGC ATC TTA 
Tyr Leu Gln Tyr Lys Gln Glu Asp Leu Val Asn Thr Ser Val Tyr Ser Ile Leu 
185 190 195 

CAT GAG CAA GAC CGG AAG GAT TTT CTT AAA CAC TTA CCA A7^ TCC ACA GTT AAT 
His Glu Gln Asp Arg Lys Asp Phe Leu Lys His Leu Pro Lys Ser Thr Val Asn 
200 205 210 215 

GGA GTT TCT TGG ACT AAT GAG AAC CAG AGA CAA AAA AGC CAT ACA TTT AAT TGT 
Gly Val Ser Trp Thr Asn Glu Asn Gln Arg Gln Lys Ser His Thr Phe Asn Cys 
220 225 230 235 

CGT ATG TTG ATG AAA ACA CAC GAC ATT TTG GAA GAC GTG AAT GCC AGT CCC GAA 
Arg Met Leu Met Lys Thr His Asp Ile Leu Glu Asp Val Asn Ala Ser Pro Glu 
240 245 250 

ACA CGC CAG AGA TAT GAA ACA ATG CAG TGC TTT GCC CTG TCT CAG CCT CGC GCT 
Thr Arg Gln Arg Tyr Glu Thr Met Gln Cys Phe Ala Leu Ser Gln Pro Arg Ala 
255 260 265 270 

ATG CTG GAA GAA GGA GAA GAC TTG CAG TGC TGT ATG ATC TGC GTG GCT CGC CGC 
Met Leu Glu Glu Gly Glu Asp Leu Gln Cys Cys Met Ile Cys Val Ala Arg Arg 
275 280 285 

GTG ACT GCG CCA TTC CCA TCC AGT CCT GAG AGC TTT ATT ACC AGA CAT GAC CTT 
Val Thr Ala Pro Phe Pro Ser Ser Pro Glu Ser Phe Ile Thr Arg His Asp Leu 
290 295 300 305 

TCC GGA AAG GTT GTC AAT ATA GAT ACA AAC TCA CTT AGA TCT TCC ATG AGG CCT 
Ser Gly Lys Val Val Asn Ile Asp Thr Asn Ser Leu Arg Ser Ser Met Arg Pro 
310 315 320 325 

GGC TTT GAA GAC ATA ATC CGA AGA TGT ATC CAG AGG TTC TTC AGT CTG AAT GAT 
Gly Phe Glu Asp Ile Ile Arg Arg Cys Ile Gln Arg Phe Phe Ser Leu Asn Asp 
330 335 340 

GGG CAG TCA TGG TCC CAG AAG CGT CAC TAT CAA GAA GCT TAT GTT CAT GGC CAC 
Gly Gln Ser Trp Ser Gln Lys Arg His Tyr Gln Glu Ala Tyr Val His Gly His 
345 350 355 360 

GCA GAG ACC CCC GTG TAT CGT TTC TCC TTG GCT GAT GGA ACT ATT GTG AGT GCG 
Ala Glu Thr Pro Val Tyr Arg Phe Ser Leu Ala Asp Gly Thr Ile Val Ser Ala 
365 370 375 

CAG ACA AAA AGC AAA CTC TTC CGC AAT CCT GTA ACG AAT GAT CGT CAC GGC TTC 
Gln Thr Lys Ser Lys Leu Phe Arg Asn Pro Val Thr Asn Asp Arg His Gly Phe 
380 385 390 395 

ATC TCG ACC CAC TTT CTT CAG AGA GAA CAG AAT GGA TAC AGA CCA AAC CCA AAT 
Ile Ser Thr His Phe Leu Gln Arg Glu Gln Asn Gly Tyr Arg Pro Asn Pro Asn 
400 405 410 415 

CCC GCA GGA CAA GGC ATC CGA CCT CCT GCA GCA GGG TGT GGC GTG AGC ATG TCT 
Pro Ala Gly Gln Gly Ile Arg Pro Pro Ala Ala Gly Cys Gly Val Ser Met Ser 
420 425 430 

CCA AAT CAG AAT GTA CAG ATG ATG GGC AGC CGG ACC TAT GGC GTG CCA GAC CCC 
Pro Asn Gln Asn Val Gln Met Met Gly Ser Arg Thr Tyr Gly Val Pro Asp Pro 
435 440 445 450 

AGC AAC ACA GGG CAG ATG GGT GGA GCT AGG TAC GGG GCT TCT AGT AGC GTA GCC 
Ser Asn Thr Gly Gln Met Gly Gly Ala Arg Tyr Gly Ala Ser Ser Ser Val Ala 
455 460 465 

TCA CTG ACG CCA GGA CAA AGC CTA CAG TCG CCA TCT TCC TAT CAG AAC AGC AGC 
Ser Leu Thr Pro Gly Gln Ser Leu Gln Ser Pro Ser Ser Tyr Gln Asn Ser Ser 
470 475 480 485 

TAT GGG CTC AGC ATG AGC AGT CCC CCC CAC GGC AGT CCT GGT CTT GGT CCC AAC 
Tyr Gly Leu Ser Met Ser Ser Pro Pro His Gly Ser Pro Gly Leu Gly Pro Asn 
490 495 500 505 

CAG CAG AAC ATC ATG ATT TCC CCT CGG AAT CGT GGC AGC CCA AAG ATG GCC TCC 
Gln Gln Asn Ile Met Ile Ser Pro Arg Asn Arg Gly Ser Pro Lys Met Ala Ser 
510 515 520 

CAC CAG TTC TCT CCT GCT GCA GGT GCA CAC TCA CCC ATG GGA CCT TCT GGC AAC 
His Gln Phe Ser Pro Ala Ala Gly Ala His Ser Pro Met Gly Pro Ser Gly Asn 
525 530 535 540 

ACA GGG AGC CAC AGC TTT TCT AGC AGC TCC CTC AGT GCC TTG CAA GCC ATC AGT 
Thr Gly Ser His Ser Phe Ser Ser Ser Ser Leu Ser Ala Leu Gln Ala Ile Ser 
545 550 555 

GAA GGC GTG GGG ACC TCT CTT TTA TCT ACT CTG TCC TCA CCA GGC CCC AAA CTG 
Glu Gly Val Gly Thr Ser Leu Leu Ser Thr Leu Ser Ser Pro Gly Pro Lys Leu 
560 565 570 575 

GAT AAT TCT CCC AAT ATG AAT ATA AGC CAG CCA AGT AAA GTG AGT GGT CAG GAC 
Asp Asn Ser Pro Asn Met Asn Ile Ser Gln Pro Ser Lys Val Ser Gly Gln Asp 
585 590 595 

TCT AAG AGC CCC CTA GGC TTA TAC TGT GAA CAG AAT CCA GTG GAG AGT TCA GTG 
Ser Lys Ser Pro Leu Gly Leu Tyr Cys Glu Gln Asn Pro Val Glu Ser Ser Val 
600 605 610 

TGT CAG TCA AAC AGC AGA GAT CAC CCA AGT GAA AAA GAA AGC AAG GAG AGC AGT 
Cys Gln Ser Asn Ser Arg Asp His Pro Ser Glu Lys Glu Ser Lys Glu Ser Ser 
615 620 625 630 

GGG GAG GTG TCA GAG ACG CCC AGG GGA CCT CTG GAA AGC AAA GGC CAC AAG AAA 
Gly Glu Val Ser Glu Thr Pro Arg Gly Pro Leu Glu Ser Lys Gly His Lys Lys 
635 640 645 

CTG CTG CAG TTA CTC ACG TGC TCC TCC GAC GAC CGA GGC CAT TCC TCC TTG ACC 
Leu Leu Gln Leu Leu Thr Cys Ser Ser Asp Asp Arg Gly His Ser Ser Leu Thr 
650 655 660 665 

AAC TCT CCC CTG GAT CCA AAC TGC AAA GAC TCT TCC GTT AGT GTC ACC AGC CCC 
Asn Ser Pro Leu Asp Pro Asn Cys Lys Asp Ser Ser Val Ser Val Thr Ser Pro 
670 675 680 685 

TCT GGA GTG TCC TCC TCA ACA TCA GGG ACA GTG TCT TCC ACC TCC AAT GTG CAT 
Ser Gly Val Ser Ser Ser Thr Ser Gly Thr Val Ser Ser Thr Ser Asn Val His 
690 695 700 

GGG TCT CTG TTG CAA GAG AAA CAC CGG ATT TTG CAC AAG TTG CTG CAG AAT GGC 
Gly Ser Leu Leu Gln Glu Lys His Arg Ile Leu His Lys Leu Leu Gln Asn Gly 
705 710 715 720 

AAC TCC CCA GCG GAG GTC GCC AAG ATC ACT GCA GAG GCC ACT GGG AAG GAC ACG 
Asn Ser Pro Ala Glu Val Ala Lys Ile Thr Ala Glu Ala Thr Gly Lys Asp Thr 






725 










730 










735 






740 


AGC 


AGC 


ACT 


GCT 


TCC 


TGT 


GGA 


GAG 


GGG 


ACA 


ACC 


AGG 


CAG 


GAG 


CAG 


CTG 


AGT 


CCT 


Ser 


Ser 


Thr 


Ala 


Ser 


Cys 


Gly 


Glu 


Gly 


Thr 


Thr 


Arg 


Gin 


Glu 


Gin 


Leu 


Ser 


Pro 










745 










750 










755 








AAG 


AAG 


AAG 


GAG 


AAT 


AAT 


GCT 


CTG 


CTT 


AGA 


TAC 


CTG 


CTG 


GAC 


AGG 


GAT 


GAC 


CCC 


Lys 


Lys 


Lys 


Glu 


Asn 


Asn 


Ala 


Leu 


Leu 


Arg 


Tyr 


Leu 


Leu 


Asp 


Arg 


Asp 


Asp 


Pro 




760 










765 










770 










775 




AGT 


GAT 


GTG 


CTT 


GCC 


AAA 


GAG 


CTG 


CAG 


CCC 


CAG 


GCC 


GAC 


AGT 


GGG 


GAC 


AGT 


AAA 


Ser 


Asp 


Val 


Leu 


Ala 


Lys 


Glu 


Leu 


Gin 


Pro 


Gin 


Ala 


Asp 


Ser 


Gly 


Asp 


Ser 


Lys 








780 










785 










790 






CTG 


AGT 


CAG 


TGC 


AGC 


TGC 


TCC 


ACC 


AAT 


CCC 


AGC 


TCT 


GGC 


CAA 


GAG 


AAA 


GAC 


CCC 


Leu 


Ser 


Gin 


Cys 


Ser 


Cys 


Ser 


Thr 


Asn 


Pro 


Ser 


Ser 


Gly 


Gin 


Glu 


Lys 


Asp 


Pro 


7 95 










800 










805 










810 




AAA 


ATT 


AAG 


ACC 


GAG 


ACG 


AAC 


GAG 


GAG 


GTA 


TCG 


GGA 


GAC 


CTG 


GAT 


AAT 


CTA 


GAT 


Lys 


lie 


Lys 


Thr 


Glu 


Thr 


Asn 


Glu 


Glu 


Val 


Ser 


Gly 


Asp 


Leu 


Asp 


Asn 


Leu 


Asp 






815 










820 










825 










830 


GCC 


ATT 


CTT 


GGA 


GAT 


TTG 


ACC 


AGT 


TCT 


GAC 


TTC 


TAC 


AAC 


AAT 


CCT 


ACA 


AAT 


GGC 


Ala 


lie 


Leu 


Gly Asp 


Leu 


Thr 


Ser 


Ser 


Asp 


Phe 


Tyr 


Asn 


Asn 


Pro 


Thr 


Asn 


Gly 










835 










840 










845 






GGT 


CAC 


CCA 


GGG 


GCC 


AAA 


CAG 


CAG 


ATG 


TTT 


GCA 


GGA 


CCG 


AGT 


TCT 


CTG 


GGT 


TTG 


Gly 


His 


Pro 


Gly 


Ala 


Lys 


Gin 


Gin 


Met 


Phe 


Ala 


Gly 


Pro 


Ser 


Ser 


Leu 


Gly 


Leu 




850 










855 










860 










865 




CGA 


AGT 


CCA 


CAG 


CCT 


GTG 


CAG 


TCT 


GTT 


CGT 


CCT 


CCA 


TAT 


AAC 


CGA 


GCG 


GTG 


TCT 


Arg 


Ser 


Pro 


Gin 


Pro 


Val 


Gin 


Ser 


Val 


Arg 


Pro 


Pro 


Tyr 


Asn 


Arg 


Ala 


Val 


Ser 








870 










875 










880 








CTG 


GAT 


AGC 


CCT 


GTG 


TCT 


GTT 


GGC 


TCA 


GGT 


CCG 


CCA 


GTG 


AAG 


AAT 


GTC 


AGT 


GCT 


Leu 


Asp 


Ser 


Pro 


Val 


Ser 


Val 


Gly 


Ser 


Gly 


Pro 


Pro 


Val 


Lys 


Asn 


Val 


Ser 


Ala 


o o c; 
ooo 










890 










895 










900 






TTC 


CCT 


GGG 


TTA 


CCA 


AAA 


CAG 


CCC 


ATA 


CTG 


GCT 


GGG 


AAT 


CCA 


AGA 


ATG 


ATG 


GAT 




Pro 


Gly 


Leu 


Pro 


Lys 


Gin 


Pro 


lie 


Leu 


Ala 


Gly 


Asn 


Pro 


Arg 


Met 


Met 


Asp 






905 










910 










915 






920 


AGT 


CAG 


GAG 


AAT 


TAC 


GGT 


GCC 


AAC 


ATG 


GGC 


CCA 


AAC 


AGA 


AAT 


GTT 


CCT 


GTG 


AAT 


Ser 


Gin 


Glu 


Asn 


Tyr 


Gly 


Ala 


Asn 


Met 


Gly 


Pro 


Asn 


Arg 


Asn 


Val 


Pro 


Val 


Asn 








925 










930 










935 










CCG 


ACT 


TCC 


TCC 


CCC 


GGA 


GAC 


TGG 


GGC 


TTA 


GCT 


AAC 


TCA 


AGG 


GCC 


AGC 


AGA 


ATG 


Pro 


Thr 


Ser 


Ser 


Pro 


Gly Asp 


Trp 


Gly 


Leu 


Ala 


Asn 


Ser 


Arg 


Ala 


Ser 


Arg 


Met 


940 










945 










950 










955 




GAG 


CCT 


CTG 


GCA 


TCA 


AGT 


CCC 


CTG 


GGA 


AGA 


ACT 


GGA 


GCC 


GAT 


TAC 


AGT 


GCC 


ACT 
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Glu 


Pro 


Leu 


Ala 


Ser 


Ser 


Pro 


Leu 


Gly 


Arg 


Thr 


Gly 


Ala 


Asp 


Tyr 


Ser 


Ala 


Thr 






960 










965 










970 










975 


TTA 


CCC 


AGA 


CCT 


GCC 


ATG 


GGG 


GGC 


TCT 


GTG 


CCT 


ACC 


TTG 


CCA 


CTT 


CGT 


TCT 


AAT 


Leu 


Pro 


Arg 


Pro 


Ala 


Met 


Gly 


Gly 


Ser 


Val 


Pro 


Thr 


Leu 


Pro 


Leu 


Arg 


Ser 


Asn 










980 










985 










990 








CGA 


CTG 


CCA 


GGT 


GCA 


AGA 


CCA 


TCG 


TTG 


CAG 


CAA 


CAG 


CAG 


CAG 


CAA 


CAG 


CAG 


CAA 


Arg 


Leu 


Pro 


Gly 


Ala 


Arg 


Pro 


Ser 


Leu 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 




995 










1000 








1005 








1010 


CAG 


CAA 


CAA 


CAA 


CAG 


CAG 


CAA 


CAG 


CAG 


CAG 


CAA 


CAG 


CAG 


CAG 


CAG 


CAA 


CAG 


CAG 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 








1015 








1020 








1025 








CAG 


ATG 


CTT 


CAA 


ATG 


AGA 


ACT 


GGT 


GAG 


ATT 


CCC 


ATG 


GGA 


ATG 


GGA 


GTC 


AAT 


CCC 


Gin 


Met 


Leu 


Gin 


Met 


Arg 


Thr 


Gly 


Glu 


He 


Pro 


Met 


Gly 


Met 


Gly 


Val 


Asn 


Pro 


1030 








1035 








1040 








1045 




TAT 


AGC 


CCA 


GCA 


GTG 


CCG 


TCT 


AAC 


CAA 


CCA 


GGT 


TCC 


TGG 


CCA 


GAG 


GGC 


ATG 


CTC 


Tyr 


Ser 


Pro 


Ala 


Val 


Pro 


Ser 


Asn 


Gin 


Pro 


Gly 


Ser 


Trp 


Pro 


Glu 


Gly 


Met 


Leu 






1050 








1055 








1060 






10 65 


TCT 


ATG 


GAA 


CAA 


GGT 


CCT 


CAC 


GGG 


TCT 


CAA 


AAT 


AGG 


CCT 


CTT 


CTT 


AGA 


AAC 


TCT 


Ser 


Met 


Glu 


Gin 


Gly 


Pro 


His 


Gly 


Ser 


Gin 


Asn 


Arg 


Pro 


Leu 


Leu 


Arg 


Asn 


Ser 










1070 








1075 








1080 






CTG 


GAT 


GAT 


CTG 


CTT 


GGG 


CCA 


CCT 


TCT 


AAC 


GCA 


GAG 


GGC 


CAG 


AGT 


GAC 


GAG 


AGA 


Leu 


Asp 


Asp 


Leu 


Leu 


Gly 


Pro 


Pro 


Ser 


Asn 


Ala 


Glu 


Gly 


Gin 


Ser 


Asp 


Glu 


Arg 

) 




1085 








1090 








1095 






HOC 


GCT 


CTG 


CTG 


GAC 


CAG 


CTG 


CAC 


ACA 


CTC 


CTG 


AGC 


AAC 


ACA 


GAT 


GCC 


ACA 


GGT 


CTG 


Ala 


Leu 


Leu 


Asp 


Gin 


Leu 


His 


Thr 


Leu 


Leu 


Ser 


Asn 


Thr 


Asp 


Ala 


Thr 


Gly 


Leu 








1105 








1110 








1115 






GAG 


GAG 


ATC 


GAC 


AGG 


GCC 


TTG 


GGA 


ATT 


CCT 


GAG 


CTC 


GTG 


AAT 


CAG 


GGA 


CAA 


GCT 


Glu 


Glu 


He 


Asp 


Arg 


Ala 


Leu 


Gly 


He 


Pro 


Glu 


Leu 


Val 


Asn 


Gin 


Gly 


Gin 


Ala 


1120 








1125 








1130 








1135 




TTG 


GAG 


TCC 


AAA 


CAG 


GAT 


GTT 


TTC 


CAA 


GGC 


CAA 


GAA 


GCA 


GCA 


GTA 


ATG 


ATG 


GAT 


Leu 


Glu 


Ser 


Lys 


Gin 


Asp 


Val 


Phe 


Gin 


Gly 


Gin 


Glu 


Ala 


Ala 


Val 


Met 


Met 


Asp 






1140 








1145 








1150 








1155 


CAG 


AAG 


GCT 


GCA 


CTA 


TAT 


GGA 


CAG 


ACA 


TAC 


CCA 


GCT 


CAG 


GGT 


CCT 


CCC 


CTT 


CAA 


Gin 


Lys 


Ala 


Ala 


Leu 


Tyr 


Gly 


Gin 


Thr 


Tyr 


Pro 


Ala 


Gin 


Gly 


Pro 


Pro 


Leu 


Gin 










1160 








1165 








1170 






GGA 


GGC 


TTT 


AAC 


CTT 


CAG 


GGA 


CAG 


TCA 


CCA 


TCG 


TTT 


AAC 


TCT 


ATG 


ATG 


GGT 


CAG 


Gly 


Gly 


Phe 


Asn 


Leu 


Gin 


Gly 


Gin 


Ser 


Pro 


Ser 


Phe 


Asn 


Ser 


Met 


Met 


Gly 


Gin 




1175 








1180 








1185 








1190 


ATT 


AGC 


CAG 


CAA 


GGC 


AGC 


TTT 


CCT 


CTG 


CAA 


GGC 


ATG 


CAT 


CCT 


AGA 


GCC 


GGC 


CTC 


He 


Ser 


Gin 


Gin 


Gly 


Ser 


Phe 


Pro 


Leu 


Gin 


Gly 


Met 


His 


Pro 


Arg 


Ala 


Gly 


Leu 








1195 








1200 








1205 








GTG 


AGA 


CCA 


AGG 


ACC 


AAC 


ACC 


CCG 


AAG 


CAG 


CTG 


AGA 


ATG 


CAG 


CTT 


CAG 


CAG 


AGG 


Val 


Arg 


Pro 


Arg 


Thr 


Asn 


Thr 


Pro 


Lys 


Gin 


Leu 


Arg 


Met 


Gin 


Leu 


Gin 


Gin 


Arg 


1210 








1215 








1220 








1225 




CTA 


CAG 


GGC 


CAG 


CAG 


TTT 


TTA 


AAT 


CAG 


AGC 


CGG 


CAG 


GCA 


CTT 


GAA 


ATG 


AAA 


ATG 


Leu 


Gin 


Gly 


Gin 


Gin 


Phe 


Leu 


Asn 


Gin 


Ser 


Arg 


Gin 


Ala 


Leu 


Glu 


Met 


Lys 


Met 






1230 








1235 








1240 






1245 


GAG 


AAC 


CCT 


GCT 


GGC 


ACT 


GCT 


GTG 


ATG 


AGG 


CCC 


ATG 


ATG 


CCC 


CAG 


GCT 


TTC 


TTT 


Glu 


Asn 


Pro 


Ala 


Gly 


Thr 


Ala 


Val 


Met 


Arg 


Pro 


Met 


Met 


Pro 


Gin 


Ala 


Phe 


Phe 










1250 








1255 








1260 






AAT 


GCC 


CAA 


ATG 


GCT 


GCC 


CAG 


CAG 


AAA 


CGA 


GAG 


CTG 


ATG 


AGC 


CAT 


CAC 


CTG 


CAG 


Asn 


Ala 


Gin 


Met 


Ala 


Ala 


Gin 


Gin 


Lys 


Arg 


Glu 


Leu 


Met 


Ser 


His 


His 


Leu 


Gin 




1265 








1270 








1275 








1280 


CAG 


CAG 


AGG 


ATG 


GCG 


ATG 


ATG 


ATG 


TCA 


CAA 


CCA 


CAG 


CCT 


CAG 


GCC 


TTC 


AGC 


CCA 


Gin 


Gin 


Arg 


Met 


Ala 


Met 


Met 


Met 


Ser 


Gin 


Pro 


Gin 


Pro 


Gin 


Ala 


Phe 


Ser 


Pro 








1285 








1290 








1295 








CCT 


CCC 


AAC 


GTC 


ACC 


GCC 


TCC 


CCC 


AGC 


ATG 


GAC 


GGG 


GTT 


TTG 


GCA 


GGT 


TCA 


GCA 


Pro 


Pro 


Asn 


Val 


Thr 


Ala 


Ser 


Pro 


Ser 


Met 


Asp 


Gly 


Val 


Leu 


Ala 


Gly 


Ser 


Ala 


1300 








1305 








1310 








1315 




ATG 


CCG 


CAA 


GCC 


CCT 


CCA 


CAA 


CAG 


TTT 


CCA 


TAT 


CCA 


GCA 


AAT 


TAC 


GGA 


ATG 


GGA 


Met 


Pro 


Gin 


Ala 


Pro 


Pro 


Gin 


Gin 


Phe 


Pro 


Tyr 


Pro 


Ala 


Asn 


Tyr 


Gly 


Met 


Gly 






1320 








1325 








1330 




1335 


CAA 


CCA 


CCA 


GAG 


CCA 


GCC 


TTT 


GGT 


CGA 


GGC 


TCG 


AGT 


CCT 


CCC 


AGT 


GCA 


ATG 


ATG 


Gin 


Pro 


Pro 


Glu 


Pro 


Ala 


Phe 


Gly 


Arg 


Gly 


Ser 


Ser 


Pro 


Pro 


Ser 


Ala 


Met 


Met 










1340 








1345 








1350 






TCA 


TCA 


AGA 


ATG 


GGG 


CCT 


TCC 


CAG 


AAT 


GCC 


ATG 


GTG 


CAG 


CAT 


CCT 


CAG 


CCC 


ACA 


Ser 


Ser 


Arg 


Met 


Gly 


Pro 


Ser 


Gin 


Asn 


Ala 


Met 


Val 


Gin 


His 


Pro 


Gin 


Pro 


Thr 




1355 








1360 








1365 








1370 


CCC 


ATG 


TAT 


CAG 


CCT 


TCA 


GAT 


ATG 


AAG 


GGG 


TGG 


CCG 


TCA 


GGG 


AAC 


CTG 


GCC 


AGG 


Pro 


Met 


Tyr 


Gin 


Pro 


Ser 


Asp 


Met 


Lys 


Gly 


Trp 


Pro 


Ser 


Gly 


Asn 


Leu 


Ala 


Arg 
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1375 1380 1385 



AAT 


GGC 


TCC 


TTC 


CCC 


CAG 


CAG 


CAG 


TTT 


GCT 


CCC 


CAG 


GGG 


AAC 


CCT 


GCA 


GCC 


TAC 


Asn 


Gly 


Ser 


Phe 


Pro 


Gin 


Gin 


Gin 


Phe 


Ala 


Pro 


Gin 


Gly 


Asn 


Pro 


Ala 


Ala 


Tyr 






1390 








1395 








1400 








1405 


AAC 


ATG 


GTG 


CAT 


ATG 


AAC 


AGC 


AGC 


GGT 


GGG 


CAC 


TTG 


GGA 


CAG 


ATG 


GCC 


ATG 


ACC 


Asn 


Met 


Val 


His 


Met 


Asn 


Ser 


Ser 


Gly 


Gly 


His 


Leu 


Gly 


Gin 


Met 


Ala 


Met 


Thr 










1410 








1415 








1420 






CCC 


ATG 


CCC 


ATG 


TCT 


GGC 


ATG 


CCC 


ATG 


GGC 


CCC 


GAT 


CAG 


AAA 


TAC 


TGC 


TGA 


CAT 


Pro 


Met 


Pro 


Met 


Ser 


Gly 


Met 


Pro 


Met 


Gly 


Pro 


Asp 


Gin 


Lys 


Tyr 


Cys 


* ★ * 


His 




1425 








1430 








1435 








1440 


CTC 


CCT 


AGT 


GGG 


ACT 


GAC 


TGT 


ACA 


GAT 


GAC 


ACT 


GCA 


CAG 


GAT 


CAT 


CAG 


GAC 


GTG 


Leu 


Pro 


Ser 


Gly 


Thr 


Asp 


Cys 


Thr 


Asp Asp 


Thr 


Ala 


Gin 


Asp 


His 


Gin 


Asp 


Val 








1445 








1450 








1455 






GCG 


GCG 


AGT 


CAT 


TGT 


CTA 


AGC 


ATC 


CAG 


CTT 


GGA 


AAC 


AAG 


GCC 


AGC 


GTG 


ACC 


AGC 


Ala 


Ala 


Ser 


His 


Cys 


Leu 


Ser 


He 


Gin 


Leu 


Gly 


Asn 


Lys 


Ala 


Ser 


Val 


Thr 


Ser 


1460 








1465 








1470 






1475 




AGC 


GGG 


GTC 


TGT 


GCT 


GTC 


ATT 


TGA 


GCA 


GAG 


CTG 


GGT 


CTC 


GCT 


GAA 


GCG 


CAC 


TGT 


Ser 


Gly 


Val 


Cys 


Ala 


Val 


He 


* * * 


























1480 








1485 








1490 








1495 


CTA 


CCT 


GAT 


GCC 


CTG 


CCT 


CTG 


TGT 


GGC 


AAG 


GTG 


TTC 


TGC 


CTC 


ATG 


AGG 


ATG 


TGA 










1500 








1505 








1510 






TTC 


TGG 


AGA 


TGG 


GGT 


GTT 


CGT 


AAG 


CAC 


CGC 


TCT 


CTT 


ACG 


TCA 


CTC 


CCT 


TCT 


GCC 




1515 








1520 








1525 








1530 


TCG 


CCA 


GCC 


AAA 


GTC 


TTC 


ACG 


TAG 


ATC 


TAG 



















25 1535 1540 
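The codon rows and three-letter amino acid rows of the listing above can be cross-checked mechanically. The following is a minimal editorial sketch, not part of the filing, and it assumes the standard genetic code applies to this sequence. The names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative only. The sample row is the one that precedes position markers 510 515 520.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names; stops are written '***' as in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "***",
}


def translate(codons):
    """Translate a list of DNA codons into three-letter residue names."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codons]


# Codon row taken from the listing (the row ending at position marker 520).
row = ("CAG CAG AAC ATC ATG ATT TCC CCT CGG AAT CGT GGC "
       "AGC CCA AAG ATG GCC TCC").split()
print(" ".join(translate(row)))
```

A row whose printed residues differ from the residue line in the listing would point to a transcription error in either the codon or the residue line.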



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22
(B) TYPE: nucleic acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

5'-TCCTTTTCCCAGCAGCAGTTTG-3'



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20
(B) TYPE: nucleic acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

5'-ATGCCAGACATGGGCATGGG-3'
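The declared LENGTH fields of the two oligonucleotide entries can be checked against the sequences themselves. The sketch below is an editorial illustration, not part of the filing. The helper names `reverse_complement` and `gc_fraction` are hypothetical, and the sequences are written with the internal spacing of the scanned text removed.

```python
# Oligonucleotides as given in SEQ ID NO: 10 and 11, paired with their declared lengths.
PRIMERS = {
    "SEQ ID NO: 10": ("TCCTTTTCCCAGCAGCAGTTTG", 22),
    "SEQ ID NO: 11": ("ATGCCAGACATGGGCATGGG", 20),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, (seq, declared_length) in PRIMERS.items():
    # The declared LENGTH field should match the number of bases listed.
    assert len(seq) == declared_length, name
    print(name, len(seq), reverse_complement(seq), round(gc_fraction(seq), 2))
```

A mismatch between the declared and counted lengths would usually indicate bases lost or duplicated during text extraction.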

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1539
(B) TYPE: amino acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ser Gly Leu Gly Glu Ser Ser Leu Asp Pro Leu Ala Ala Glu Ser Arg Lys
40 45 50 55

Arg Lys Leu Pro Cys Asp Ala Pro Gly Gln Gly Leu Val Tyr Ser Gly Glu Lys
60 65 70

Trp Arg Arg Glu Gln Glu Ser Lys Tyr Ile Glu Glu Leu Ala Glu Leu Ile Ser
75 80 85 90

Ala Asn Leu Ser Asp Ile Asp Asn Phe Asn Val Lys Pro Asp Lys Cys Ala Ile
95 100 105

Leu Lys Glu Thr Val Arg Gln Ile Arg Gln Ile Lys Glu Gln Gly Lys Thr Ile
110 115 120 125

Ser Ser Asp Asp Asp Val Gln Lys Ala Asp Val Ser Ser Thr Gly Gln Gly Val
130 135 140 145

Ile Asp Lys Asp Ser Leu Gly Pro Leu Leu Leu Gln Ala Leu Asp Gly Phe Leu
150 155 160

Phe Val Val Asn Arg Asp Gly Asn Ile Val Phe Val Ser Glu Asn Val Thr Gln
165 170 175 180

Tyr Leu Gln Tyr Lys Gln Glu Asp Leu Val Asn Thr Ser Val Tyr Ser Ile Leu
185 190 195

His Glu Gln Asp Arg Lys Asp Phe Leu Lys His Leu Pro Lys Ser Thr Val Asn
200 205 210 215

Gly Val Ser Trp Thr Asn Glu Asn Gln Arg Gln Lys Ser His Thr Phe Asn Cys
220 225 230 235

Arg Met Leu Met Lys Thr His Asp Ile Leu Glu Asp Val Asn Ala Ser Pro Glu
240 245 250

Thr Arg Gln Arg Tyr Glu Thr Met Gln Cys Phe Ala Leu Ser Gln Pro Arg Ala
255 260 265 270

Met Leu Glu Glu Gly Glu Asp Leu Gln Cys Cys Met Ile Cys Val Ala Arg Arg
275 280 285

Val Thr Ala Pro Phe Pro Ser Ser Pro Glu Ser Phe Ile Thr Arg His Asp Leu
290 295 300 305

Ser Gly Lys Val Val Asn Ile Asp Thr Asn Ser Leu Arg Ser Ser Met Arg Pro
310 315 320 325

Gly Phe Glu Asp Ile Ile Arg Arg Cys Ile Gln Arg Phe Phe Ser Leu Asn Asp
330 335 340

Gly Gln Ser Trp Ser Gln Lys Arg His Tyr Gln Glu Ala Tyr Val His Gly His
345 350 355 360

Ala Glu Thr Pro Val Tyr Arg Phe Ser Leu Ala Asp Gly Thr Ile Val Ser Ala
365 370 375

Gln Thr Lys Ser Lys Leu Phe Arg Asn Pro Val Thr Asn Asp Arg His Gly Phe
380 385 390 395

Ile Ser Thr His Phe Leu Gln Arg Glu Gln Asn Gly Tyr Arg Pro Asn Pro Asn
400 405 410 415

Pro Ala Gly Gln Gly Ile Arg Pro Pro Ala Ala Gly Cys Gly Val Ser Met Ser
420 425 430

Pro Asn Gln Asn Val Gln Met Met Gly Ser Arg Thr Tyr Gly Val Pro Asp Pro
435 440 445 450

Ser Asn Thr Gly Gln Met Gly Gly Ala Arg Tyr Gly Ala Ser Ser Ser Val Ala
455 460 465

Ser Leu Thr Pro Gly Gln Ser Leu Gln Ser Pro Ser Ser Tyr Gln Asn Ser Ser
470 475 480 485

Tyr Gly Leu Ser Met Ser Ser Pro Pro His Gly Ser Pro Gly Leu Gly Pro Asn
490 495 500 505

Gln Gln Asn Ile Met Ile Ser Pro Arg Asn Arg Gly Ser Pro Lys Met Ala Ser
510 515 520

His Gln Phe Ser Pro Ala Ala Gly Ala His Ser Pro Met Gly Pro Ser Gly Asn
525 530 535 540

Thr Gly Ser His Ser Phe Ser Ser Ser Ser Leu Ser Ala Leu Gln Ala Ile Ser
545 550 555

Glu Gly Val Gly Thr Ser Leu Leu Ser Thr Leu Ser Ser Pro Gly Pro Lys Leu
560 565 570 575

Asp Asn Ser Pro Asn Met Asn Ile Ser Gln Pro Ser Lys Val Ser Gly Gln Asp
580 585 590 595

Ser Lys Ser Pro Leu Gly Leu Tyr Cys Glu Gln Asn Pro Val Glu Ser Ser Val
600 605 610

Cys Gln Ser Asn Ser Arg Asp His Pro Ser Glu Lys Glu Ser Lys Glu Ser Ser
615 620 625 630

Gly Glu Val Ser Glu Thr Pro Arg Gly Pro Leu Glu Ser Lys Gly His Lys Lys
635 640 645

Leu Leu Gln Leu Leu Thr Cys Ser Ser Asp Asp Arg Gly His Ser Ser Leu Thr
650 655 660 665

Asn Ser Pro Leu Asp Pro Asn Cys Lys Asp Ser Ser Val Ser Val Thr Ser Pro
670 675 680 685

Ser Gly Val Ser Ser Ser Thr Ser Gly Thr Val Ser Ser Thr Ser Asn Val His
690 695 700

Gly Ser Leu Leu Gln Glu Lys His Arg Ile Leu His Lys Leu Leu Gln Asn Gly
705 710 715 720

Asn Ser Pro Ala Glu Val Ala Lys Ile Thr Ala Glu Ala Thr Gly Lys Asp Thr
725 730 735 740

Ser Ser Thr Ala Ser Cys Gly Glu Gly Thr Thr Arg Gln Glu Gln Leu Ser Pro
745 750 755

Lys Lys Lys Glu Asn Asn Ala Leu Leu Arg Tyr Leu Leu Asp Arg Asp Asp Pro
760 765 770 775

Ser Asp Val Leu Ala Lys Glu Leu Gln Pro Gln Ala Asp Ser Gly Asp Ser Lys
780 785 790

Leu Ser Gln Cys Ser Cys Ser Thr Asn Pro Ser Ser Gly Gln Glu Lys Asp Pro
795 800 805 810

Lys Ile Lys Thr Glu Thr Asn Glu Glu Val Ser Gly Asp Leu Asp Asn Leu Asp
815 820 825 830

Ala Ile Leu Gly Asp Leu Thr Ser Ser Asp Phe Tyr Asn Asn Pro Thr Asn Gly
835 840 845

Gly His Pro Gly Ala Lys Gln Gln Met Phe Ala Gly Pro Ser Ser Leu Gly Leu
850 855 860 865

Arg Ser Pro Gln Pro Val Gln Ser Val Arg Pro Pro Tyr Asn Arg Ala Val Ser
870 875 880

Leu Asp Ser Pro Val Ser Val Gly Ser Gly Pro Pro Val Lys Asn Val Ser Ala
885 890 895 900

Phe Pro Gly Leu Pro Lys Gln Pro Ile Leu Ala Gly Asn Pro Arg Met Met Asp
905 910 915 920

Ser Gln Glu Asn Tyr Gly Ala Asn Met Gly Pro Asn Arg Asn Val Pro Val Asn
925 930 935

Pro Thr Ser Ser Pro Gly Asp Trp Gly Leu Ala Asn Ser Arg Ala Ser Arg Met
940 945 950 955

Glu Pro Leu Ala Ser Ser Pro Leu Gly Arg Thr Gly Ala Asp Tyr Ser Ala Thr
960 965 970 975

Leu Pro Arg Pro Ala Met Gly Gly Ser Val Pro Thr Leu Pro Leu Arg Ser Asn
980 985 990

Arg Leu Pro Gly Ala Arg Pro Ser Leu Gln Gln Gln Gln Gln Gln Gln Gln Gln
995 1000 1005 1010

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
1015 1020 1025

Gln Met Leu Gln Met Arg Thr Gly Glu Ile Pro Met Gly Met Gly Val Asn Pro
1030 1035 1040 1045

Tyr Ser Pro Ala Val Pro Ser Asn Gln Pro Gly Ser Trp Pro Glu Gly Met Leu
1050 1055 1060 1065

Ser Met Glu Gln Gly Pro His Gly Ser Gln Asn Arg Pro Leu Leu Arg Asn Ser
1070 1075 1080

Leu Asp Asp Leu Leu Gly Pro Pro Ser Asn Ala Glu Gly Gln Ser Asp Glu Arg
1085 1090 1095 1100

Ala Leu Leu Asp Gln Leu His Thr Leu Leu Ser Asn Thr Asp Ala Thr Gly Leu
1105 1110 1115

Glu Glu Ile Asp Arg Ala Leu Gly Ile Pro Glu Leu Val Asn Gln Gly Gln Ala
1120 1125 1130 1135

Leu Glu Ser Lys Gln Asp Val Phe Gln Gly Gln Glu Ala Ala Val Met Met Asp
1140 1145 1150 1155

Gln Lys Ala Ala Leu Tyr Gly Gln Thr Tyr Pro Ala Gln Gly Pro Pro Leu Gln
1160 1165 1170

Gly Gly Phe Asn Leu Gln Gly Gln Ser Pro Ser Phe Asn Ser Met Met Gly Gln
1175 1180 1185 1190

Ile Ser Gln Gln Gly Ser Phe Pro Leu Gln Gly Met His Pro Arg Ala Gly Leu
1195 1200 1205

Val Arg Pro Arg Thr Asn Thr Pro Lys Gln Leu Arg Met Gln Leu Gln Gln Arg
1210 1215 1220 1225

Leu Gln Gly Gln Gln Phe Leu Asn Gln Ser Arg Gln Ala Leu Glu Met Lys Met
1230 1235 1240 1245

Glu Asn Pro Ala Gly Thr Ala Val Met Arg Pro Met Met Pro Gln Ala Phe Phe
1250 1255 1260

Asn Ala Gln Met Ala Ala Gln Gln Lys Arg Glu Leu Met Ser His His Leu Gln
1265 1270 1275 1280

Gln Gln Arg Met Ala Met Met Met Ser Gln Pro Gln Pro Gln Ala Phe Ser Pro
1285 1290 1295

Pro Pro Asn Val Thr Ala Ser Pro Ser Met Asp Gly Val Leu Ala Gly Ser Ala
1300 1305 1310 1315

Met Pro Gln Ala Pro Pro Gln Gln Phe Pro Tyr Pro Ala Asn Tyr Gly Met Gly
1320 1325 1330 1335

Gln Pro Pro Glu Pro Ala Phe Gly Arg Gly Ser Ser Pro Pro Ser Ala Met Met
1340 1345 1350

Ser Ser Arg Met Gly Pro Ser Gln Asn Ala Met Val Gln His Pro Gln Pro Thr
1355 1360 1365 1370

Pro Met Tyr Gln Pro Ser Asp Met Lys Gly Trp Pro Ser Gly Asn Leu Ala Arg
1375 1380 1385

Asn Gly Ser Phe Pro Gln Gln Gln Phe Ala Pro Gln Gly Asn Pro Ala Ala Tyr
1390 1395 1400 1405

Asn Met Val His Met Asn Ser Ser Gly Gly His Leu Gly Gln Met Ala Met Thr
1410 1415 1420

Pro Met Pro Met Ser Gly Met Pro Met Gly Pro Asp Gln Lys Tyr Cys *** His
1425 1430 1435 1440

Leu Pro Ser Gly Thr Asp Cys Thr Asp Asp Thr Ala Gln Asp His Gln Asp Val
1445 1450 1455

Ala Ala Ser His Cys Leu Ser Ile Gln Leu Gly Asn Lys Ala Ser Val Thr Ser
1460 1465 1470 1475

Ser Gly Val Cys Ala Val Ile ***
1480 1485 1490 1495
What is claimed is:

1. A substantially pure DNA comprising a sequence encoding an AIB1 polypeptide.

2. The DNA of claim 1, wherein the polypeptide is human AIB1.

3. The DNA of claim 1, wherein the polypeptide comprises the amino acid sequence of SEQ. I.D. NO. 4.

4. The DNA of claim 1, wherein the polypeptide comprises the amino acid sequence of SEQ. I.D. NO. 2.

5. The DNA of claim 1, wherein the polypeptide comprises the amino acid sequence of SEQ. I.D. NO. 3.

6. The DNA of claim 1, wherein the polypeptide comprises the amino acid sequence of SEQ. I.D. NO. 8.

7. A substantially pure DNA comprising a polynucleotide which hybridizes at high stringency to a DNA having the sequence of SEQ. I.D. NO. 1, or the complement thereof.

8. A substantially pure DNA comprising a nucleotide sequence having at least 50% sequence identity to SEQ. I.D. NO. 1, the nucleotide sequence encoding a polypeptide having the biological activity of an AIB1 polypeptide.

9. A substantially pure DNA comprising (a) the sequence of SEQ. I.D. NO. 1 or (b) a degenerate variant thereof.

10. The DNA of claim 1, wherein the DNA is operably linked to regulatory sequences for expression of the polypeptide, the regulatory sequences comprising a promoter.

11. A cell comprising the DNA of claim 1.

12. A substantially pure human AIB1 polypeptide.

13. The polypeptide of claim 12, wherein the polypeptide comprises the amino acid sequence of SEQ. I.D. Nos. 2, 3, 4, or 8.

14. A method of identifying a candidate compound which inhibits estrogen receptor (ER)-dependent transcription comprising contacting the compound with an AIB1 polypeptide and determining whether the compound binds to the polypeptide, wherein binding of the compound to the polypeptide indicates that the compound inhibits ER-dependent transcription.

15. The method of claim 14, wherein the AIB1 polypeptide comprises a Per/Arnt/Sim (PAS) domain.

16. The method of claim 14, wherein the AIB1 polypeptide comprises a basic helix-loop-helix (bHLH) domain.

17. The method of claim 14, wherein the AIB1 polypeptide comprises an ER-interacting domain.

18. A method of identifying a candidate compound which inhibits ER-dependent transcription comprising:

contacting the compound with an AIB1 polypeptide and an ER polypeptide and determining the ability of the compound to interfere with the binding of the ER polypeptide with the AIB1 polypeptide.

19. The method of claim 18, wherein the AIB1 polypeptide comprises a PAS domain.

20. The method of claim 18, wherein the AIB1 polypeptide comprises a bHLH domain.

21. A method of screening a candidate compound which inhibits an interaction of an AIB1 polypeptide with an ER polypeptide in a cell comprising

(a) providing a GAL4 binding site linked to a reporter gene;

(b) providing a GAL4 binding domain linked to either (i) an AIB1 polypeptide or (ii) an ER polypeptide;

(c) providing a GAL4 transactivation domain II linked to the ER polypeptide if the GAL4 binding domain is linked to the AIB1 polypeptide or linked to the AIB1 polypeptide if the GAL4 binding domain is linked to the ER polypeptide;

(d) contacting the cell with the compound; and

(e) monitoring expression of the reporter gene, wherein a decrease in expression in the presence of the compound compared to that in the absence of the compound indicates that the compound inhibits an interaction of an AIB1 polypeptide with the ER polypeptide.

22. A method of detecting an aberrantly proliferating cell in a tissue sample comprising determining the level of AIB1 gene expression in the sample, wherein an increase in the level of expression compared to the level in normal control tissue indicates the presence of an aberrantly proliferating cell.

23. The method of claim 21, wherein the aberrantly proliferating cell is a steroid hormone-responsive cancer cell.

24. The method of claim 23, wherein the steroid hormone-responsive cancer cell is a breast cancer cell.

25. The method of claim 23, wherein the steroid hormone-responsive cancer cell is an ovarian cancer cell.

26. The method of claim 21, wherein the AIB1 gene expression is measured using an AIB1 gene-specific polynucleotide probe.

27. The method of claim 21, wherein the AIB1 gene expression is measured using an antibody specific for an AIB1 gene product.

28. A method of detecting breast cancer in a tissue sample, comprising determining the number of cellular copies of an AIB1 gene in the tissue sample, wherein an increase in the number of copies compared to the number of copies in a normal control tissue indicates the presence of a breast carcinoma.

29. The method of claim 28, wherein the number of copies in the tissue is greater than 2.

30. The method of claim 29, wherein the number of copies in the tissue is greater than 10.

31. The method of claim 30, wherein the number of copies in the tissue is greater than 20.

32. A method of reducing proliferation of a cancer cell in a mammal comprising administering to the mammal a compound which inhibits expression of AIB1.

33. The method of claim 32, wherein the compound reduces transcription of DNA encoding AIB1 in the cell.

34. The method of claim 32, wherein the compound reduces translation of an AIBl 
5 mRNA into an AIBl gene product in the cell. 

35. The method of claim 34, wherein the translation is reduced by contacting the AIBl 
mRNA with an antisense DNA complementary to the AIBl mRNA. 

36. A method of inhibiting ER-dependent transcription in a breast cell of a mammal,
comprising administering an effective amount of an AIB1 polypeptide to the mammal.

37. The method of claim 36, wherein the polypeptide comprises a PAS domain. 

38. The method of claim 36, wherein the polypeptide comprises a bHLH domain.

39. The method of claim 36, wherein the polypeptide comprises an ER-interacting
domain.

40. A method of inhibiting ER-dependent transcription in a cancer cell of a mammal,
comprising administering an effective amount of a peptide mimetic of an AIB1 polypeptide to the
mammal.

41. A monoclonal antibody which binds specifically to AIB1.

42. A method of identifying a tamoxifen-sensitive patient, comprising

(a) contacting a patient-derived tissue sample with tamoxifen; and

(b) determining the level of AIB1 gene expression in the sample, wherein an increase in
the level of expression compared to the level in normal control tissue indicates that the patient is
tamoxifen-sensitive.

43. The method of claim 42, wherein the AIB1 gene expression is measured using an
AIB1 gene-specific polynucleotide probe.



44. The method of claim 42, wherein the AIB1 gene expression is measured using an
antibody specific for an AIB1 gene product.

45. A transgenic animal wherein at least one copy of the AIB1 gene has been
functionally deleted.

46. A transgenic mouse wherein at least one copy of the pCIP gene has been
functionally deleted.

47. The invention of claim 45 wherein at least one copy of the gene has been
functionally deleted using a method selected from the group consisting of: anti-sense technology,
transposon mutagenesis, homologous recombination with a non-functional gene homolog of AIB1.

48. A transgenic animal genetically engineered to have more than the normal copy
number of the AIB1 gene.

49. The invention of claim 48 wherein at least one copy of the AIB1 gene has been
introduced into the animal on an extra-chromosomal element.



50. A transgenic animal having at least one AIB1 gene operatively linked to a
non-native promoter.

51. The invention of claim 50 wherein the non-native promoter is selected from the
group consisting of: a mouse mammary tumor virus promoter, a whey acidic protein promoter and
a metallothionein promoter.

52. The invention of claim 50 wherein transcription from the promoter has the
characteristic selected from the group consisting of: being inducible, being repressible and being
constitutive.

53. A method of reducing proliferation of a cancer cell comprising administering to the
mammal a compound which inhibits interaction of AIB1 with a molecule selected from the group
consisting of steroid receptors and nuclear co-factors.

54. The method of claim 53 wherein the molecule is selected from the group consisting
of: p300 and CBP.